Executive Summary

Predictive modeling is a powerful tool that leverages statistical techniques to forecast outcomes. It is widely used across industries, from healthcare to finance, to make informed decisions based on historical data. This tutorial introduces a framework for developing predictive models in the context of actuarial experience studies. To ground the framework in real actuarial problems, we also model the differences in mortality by product (whole life, term, etc.). With the modeling approach, we can see that:

  • The relative spread of preferred mortality differs by product.
    • For 2-class preferred systems, the residual standard mortality is much higher than preferred for term than for other products.
    • The spread for UL/VL/ULSG/VLSG for 4-class preferred systems is much wider than for other products.
  • There are divergences in the spread of face amount factors for xL, Perm, and Term, with xL narrowing relative to Term.
  • The issue age slope appears to be steeper for Term than Perm and xL under age 65. However, differences emerge above issue age 65, with the slope for Perm steepening relative to xL.
  • There are differences among the products in durations 1 and 2.
  • Since issue years 1990-1999, there has been a small but steady increase in relative mortality for xL vs Term, with xL now approaching Term.

Data

For what follows, we used a filtered and summarized subset of the Society of Actuaries’ Individual Life Experience Committee mortality data. Columns included in the extract were

  • Number of Preferred Classes
  • Preferred Class
  • Smoker Status
  • Face Amount Band
  • Observation Year
  • Duration
  • Issue Age
  • Insurance Plan
  • Anticipated Level Term Period
  • Issue Year
  • Sex
  • Death Count
  • Death Claim Amount
  • Tabular Expected Mortality by Count - 2015VBT
  • Tabular Expected Mortality by Amount - 2015VBT

The data were filtered as

  • Issue ages 18 and greater
  • Durations 25 and less
  • Experience years 2013-2017

and then grouped or combined as

  • Underwriting: concatenation of smoker status, number of preferred classes, and preferred class, in that order
  • Duration: 1, 2, 3, 4-5, 6-15, 16-25
  • Issue Age: 18-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84, 85-99
  • Issue Years: 1900-1989, 1990-1999, 2000-2009, 2010+
  • Insurance Plan: UL, ULSG, VL, VLSG collapsed into category “xL”
  • Face Amount Band: face amounts under 50,000 grouped into a single category, face amounts 1 million and greater grouped into a single category
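
As an illustration of the banding described above, the following minimal sketch groups durations into the listed bands with base R's `cut()`. The helper name `band_duration` and the zero-padded labels (chosen to resemble the `dur_band1` levels that appear later in the model output) are assumptions made for this example; the actual extract code lives in the datafiles subfolder and may differ.

```r
## Hypothetical helper: map policy durations 1-25 to the duration bands listed above.
dur_breaks <- c(0, 1, 2, 3, 5, 15, 25)
dur_labels <- c("01", "02", "03", "04-05", "06-15", "16-25")

band_duration <- function(duration) {
  ## right = TRUE makes each interval (lower, upper], so duration 5 falls in "04-05"
  cut(duration, breaks = dur_breaks, labels = dur_labels, right = TRUE)
}

## Count how many single durations fall in each band
table(band_duration(1:25))
```

The same pattern, with different breaks and labels, would apply to the issue age and issue year groupings.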

The intent of this heavy grouping and summarization was to enable running this document with modest computing resources. The source data can be replaced with a similarly constructed dataset with finer grouped variables.

The code to generate these files can be found in the datafiles subfolder. It relies on an unpublished version of the ILEC dataset which has been restructured using the Arrow framework into a collection of Parquet files. A knowledgeable reader should be able to adapt the code to whatever environment in which they keep their own copy of the ILEC dataset.

Machine Learning in Mortality Studies

Experience studies are a primary tool that actuaries use to quantify and understand historical experience. It is a natural next step to apply statistical techniques to experience studies to discover new and relevant insights. There are multiple advantages to this approach:

  • Allows the actuary to avoid cumbersome and potentially misleading univariate analysis
  • Allows the actuary to appropriately consider credibility and unlock all the credibility inherent in the data
  • Makes it easier to discover and appropriately adjust for variable interactions
  • Enables the actuary to statistically control for the different sources of variation in any given cell of a mortality study.

Problem Statement

One question of interest to actuaries is why different products exhibit different mortality outcomes. Even though they can be difficult to separately identify and quantify, it is known that underwriting, target market, policyholder behavior, and socioeconomic factors, among others, have direct bearing on mortality outcomes. With a statistical or machine learning model we have a possible solution to account for the impact of these variables. For this project, the key question we are trying to answer is how mortality varies by product in the Individual Life Experience Committee dataset. In the simplified dataset that is used herein, the product categories are Term, Perm, UL/VL, and Other. To understand the differences in mortality by product, we will construct machine learning models to predict the mortality outcomes and analyze the results for relevant insights.

Methodology

The framework will guide the process of code setup, model creation, preprocessing, and validation. It will also address commonly encountered challenges such as incorporating nonlinear relationships, determining interactions, dealing with underfitting and overfitting (the bias-variance trade-off), and model interpretability. The goal of this project is to provide actuaries with useful techniques, code, and ideas to guide future analysis of mortality studies.

There are several common key steps in any modeling process: data preprocessing, data exploration, model selection, model validation, and model interpretation. Much more can be written on these topics than we have the space to explore, and we aim to address the key considerations as they pertain to experience studies.

Modeling Approaches

When applying statistics and machine learning to experience studies, there are multiple different modeling approaches one might take. We will focus our attention on the most common approaches used: generalized linear models (GLMs), generalized linear models with penalization (also known as elastic net GLMs), and gradient boosting machines (GBMs or GBDTs). Many other approaches or variations on these approaches are also reasonable.

Generalized Linear Models (GLM)

Generalized linear models have the longest history of the methods that we will examine and in some sense are the simplest. One of the benefits of GLMs is that they allow statistical hypothesis testing. For instance, individual model coefficients can be statistically tested, and various statistical tests can be performed to validate results and compare models. The results of GLMs are also relatively simple to interpret. However, GLMs have a few disadvantages. Due to their relative simplicity, they often have lower predictive power than more flexible methods. To get the best performance out of a GLM, additional effort is needed to capture nonlinear relationships and interactions. Ultimately, this can make them more time-intensive than other methodologies.

GLMs can be extended into regularized GLMs, such as LASSO or ridge regression, which modify the objective used to fit the model by adding a penalty on the size of the coefficients. This penalty brings advantages, disadvantages, and changes to the modeling process. First, penalization biases the coefficient estimates, so the usual confidence intervals and hypothesis tests no longer apply directly. Instead of using hypothesis tests on coefficients and likelihood ratio tests to compare models, we apply a machine learning paradigm and tune the model using cross-validation. Fortunately, a regularized GLM retains the interpretability of a linear model, and it can increase the overall predictive accuracy of the model. Additionally, a LASSO penalty can perform automatic feature selection by shrinking some coefficients exactly to zero.
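
For reference, the penalized objective can be written as below. This follows the parameterization documented for the glmnet package; the symbols are introduced here for illustration and are not notation from the original text. Here $\ell$ is the log-likelihood, $N$ the number of observations, $\lambda \ge 0$ the overall penalty strength, and $\alpha \in [0, 1]$ the mixing weight between the ridge penalty ($\alpha = 0$) and the LASSO penalty ($\alpha = 1$).

```latex
(\hat{\beta}_0, \hat{\beta}) = \arg\min_{\beta_0,\,\beta}
  \left\{ -\frac{1}{N}\,\ell(\beta_0, \beta)
  + \lambda \left[ \frac{1-\alpha}{2}\,\lVert \beta \rVert_2^2
  + \alpha\,\lVert \beta \rVert_1 \right] \right\}
```

Under this parameterization, the setting `fGLMNetAlpha <- 0.5` that appears in the model options would presumably correspond to an even mix of the two penalties, with $\lambda$ chosen by cross-validation.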

Gradient-Boosted Decision Trees (GBDT)

Gradient boosted decision trees are an ensemble of decision trees generated in a stage-wise fashion. Each new tree is trained on the residuals (more generally, the negative gradient of the loss) of the ensemble built so far. The first tree is fit to the outcome, the second to the residuals of the first, the third to the residuals remaining after the first two, and so on. In this way, the model is continually refocusing on where its predictions are weakest. Popular frameworks for gradient boosted decision trees include LightGBM, CatBoost, and XGBoost. This model is one of the most effective methods for classification and regression on tabular data.
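
The stage-wise idea can be sketched in a few lines of base R. This is a toy illustration only, assuming squared-error loss, a single numeric feature, and one-split trees ("stumps"); the helper names `fit_stump`, `predict_stump`, and `boost_stumps` are hypothetical, and this is not how LightGBM is implemented.

```r
## Fit a one-split regression stump to residuals r on a single feature x
fit_stump <- function(x, r) {
  ux <- sort(unique(x))
  cuts <- (head(ux, -1) + tail(ux, -1)) / 2        # candidate split points (midpoints)
  best <- NULL
  for (cp in cuts) {
    left  <- mean(r[x <= cp])
    right <- mean(r[x > cp])
    sse   <- sum((r - ifelse(x <= cp, left, right))^2)
    if (is.null(best) || sse < best$sse) {
      best <- list(cut = cp, left = left, right = right, sse = sse)
    }
  }
  best
}

predict_stump <- function(stump, x) ifelse(x <= stump$cut, stump$left, stump$right)

## Stage-wise boosting: each stump is fit to the residuals of the ensemble so far
boost_stumps <- function(x, y, n_trees = 50, learning_rate = 0.1) {
  pred <- rep(mean(y), length(y))                  # start from the overall mean
  mse  <- numeric(n_trees)
  for (m in seq_len(n_trees)) {
    stump  <- fit_stump(x, y - pred)               # residuals of the current ensemble
    pred   <- pred + learning_rate * predict_stump(stump, x)
    mse[m] <- mean((y - pred)^2)
  }
  list(pred = pred, mse = mse)
}

set.seed(1)
x <- runif(200)
y <- sin(2 * pi * x) + rnorm(200, sd = 0.2)
fit <- boost_stumps(x, y)
plot(fit$mse, type = "l", xlab = "Number of trees", ylab = "Training MSE")
```

The training error falls with every added tree, which is also why validation data is needed to decide when to stop adding trees.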

Gradient boosting machines (applied here with LightGBM) have become the go-to approach in many tabular machine learning tasks due to their very high accuracy, ease of use, and ability to seamlessly discover important interactions. However, they can also be the most complex to interpret. To aid in interpretation, we will discuss the use of SHAP values, which are a popular method of interpretation.

Model Explanation

Ordered Lorenz Plot and Gini

An ordered Lorenz curve and the associated Gini coefficient measure the ability of a model to stratify risk. An ordered Lorenz curve is created using the model prediction as an index. Using this index, we graph the cumulative percentage of claims vs the cumulative percentage of exposure. The more bowed this line, the better the model stratifies risk. The Gini coefficient measures the difference between this line and the line of perfect equality. The better your model predicts risk, the more unequally claims are distributed across the model predictions, and thus the larger the Gini coefficient.
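
The construction above can be sketched as follows. The helper `ordered_lorenz` is hypothetical and is not the `lorenz()` function sourced from R/functions.R, whose internals are not shown in this document; the sketch assumes the Gini coefficient is taken as twice the area between the line of equality and the curve.

```r
## Hypothetical helper: ordered Lorenz curve and Gini coefficient.
## actual   = observed claims
## score    = model prediction used as the sort index (e.g. predicted A/E)
## exposure = exposure or tabular expected claims
ordered_lorenz <- function(actual, score, exposure) {
  ord <- order(score)                               # lowest predicted risk first
  cum_exposure <- c(0, cumsum(exposure[ord]) / sum(exposure))
  cum_actual   <- c(0, cumsum(actual[ord]) / sum(actual))
  ## Area under the Lorenz curve by the trapezoid rule
  auc <- sum(diff(cum_exposure) * (head(cum_actual, -1) + tail(cum_actual, -1)) / 2)
  list(cum_exposure = cum_exposure, cum_actual = cum_actual, gini = 1 - 2 * auc)
}

## Toy data: all claims fall in the cell the model scores highest
concentrated <- ordered_lorenz(actual = c(0, 0, 0, 4), score = 1:4, exposure = rep(1, 4))
## Toy data: claims proportional to exposure, so the curve sits on the diagonal
uninformative <- ordered_lorenz(actual = rep(1, 4), score = 1:4, exposure = rep(1, 4))

plot(concentrated$cum_exposure, concentrated$cum_actual, type = "l",
     xlab = "Cumulative share of exposure", ylab = "Cumulative share of claims")
abline(0, 1, lty = 2)                               # line of perfect equality
```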

Lift Plot

There are several different varieties of lift plots used in connection with machine learning. These plots are used to help visually understand the risk stratification and accuracy of a model. As presented here, lift plots sort the model predictions into deciles based upon the predicted value. For each decile, the model’s average prediction for that cell is graphed vs the value seen empirically in the data. The more these two values are in agreement, the better the model is performing.
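
A decile table of this kind might be built as in the sketch below. The helper `decile_lift` is hypothetical and is not the `decile.table()` function used later in this document; in particular, forming bins of roughly equal expected claims is an assumption of this example.

```r
## Hypothetical helper: actual vs predicted A/E by decile of the model prediction.
## actual   = observed claims
## score    = predicted actual-to-expected ratio
## expected = tabular expected claims (the offset)
decile_lift <- function(actual, score, expected, n_bins = 10) {
  ord <- order(score)
  actual <- actual[ord]; score <- score[ord]; expected <- expected[ord]
  ## Assign bins so that each holds roughly the same share of expected claims
  bin <- ceiling(n_bins * cumsum(expected) / sum(expected))
  bin <- pmin(n_bins, pmax(1, bin))
  data.frame(
    decile    = sort(unique(bin)),
    exposures = as.numeric(tapply(expected, bin, sum)),
    actual    = as.numeric(tapply(actual, bin, sum) / tapply(expected, bin, sum)),
    predicted = as.numeric(tapply(score * expected, bin, sum) / tapply(expected, bin, sum))
  )
}

## Toy data in which actual claims match the prediction exactly
expected <- rep(1, 20)
score    <- 0.5 + 0.05 * (0:19)
actual   <- score * expected
lift <- decile_lift(actual, score, expected)
lift
```

In a well-calibrated model the `actual` and `predicted` columns track each other closely, and both rise from the first decile to the last.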

SHAP

SHAP values are a method of model interpretation in machine learning and originally come from Shapley values in economics. SHAP values measure the impact each feature has on the prediction for a particular instance. This numeric score indicates how much each feature contributed to the prediction in terms of sign and magnitude.
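
To make the idea concrete, the sketch below computes exact Shapley values for a single prediction of a toy three-feature model by averaging marginal contributions over every ordering of the features. The helpers `permutations` and `shapley_exact`, the toy model, and the use of a background sample to represent "missing" features are all assumptions of this example. Production tools such as the shapviz package used later rely on far more efficient algorithms, so this is only a conceptual illustration.

```r
## Enumerate all orderings of a vector (practical for a handful of features only)
permutations <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (rest in permutations(v[-i])) out[[length(out) + 1]] <- c(v[i], rest)
  }
  out
}

## Hypothetical helper: exact Shapley values for one instance x
shapley_exact <- function(f, x, background) {
  ## Value of a coalition S: average prediction with the features in S fixed at x
  value <- function(S) {
    z <- background
    for (j in S) z[, j] <- x[j]
    mean(f(z))
  }
  phi   <- numeric(length(x))
  perms <- permutations(seq_along(x))
  for (perm in perms) {
    S <- integer(0)
    prev <- value(S)
    for (j in perm) {
      S <- c(S, j)
      cur <- value(S)
      phi[j] <- phi[j] + (cur - prev)               # marginal contribution of feature j
      prev <- cur
    }
  }
  phi / length(perms)
}

set.seed(7)
background <- matrix(rnorm(300), ncol = 3)
f <- function(z) z[, 1] * z[, 2] + 2 * z[, 3]       # toy model with one interaction
x <- c(1, 2, 3)
phi <- shapley_exact(f, x, background)
phi
```

The contributions sum to the difference between the prediction for `x` and the average prediction over the background data, which is the property that makes SHAP values additive explanations.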

Feature Importance

Feature importance is a global measure of how much a variable contributes to the predictions of a target variable within a model. This can be helpful in interpreting a model to understand the key drivers in aggregate. However, unlike SHAP values, feature importance does not help you interpret individual predictions. Feature importance is usually presented in terms of percent contribution. When presented this way, a feature importance of 20% for a feature would imply that 20% of the overall reduction in prediction error is attributable to that particular feature. There are multiple ways of measuring feature importance. One of the simplest and most intuitive is permutation feature importance. Using this method, you scramble a particular feature so that it is no longer useful and measure the percent difference in model performance before and after this change. The change in error is the importance.

The reader should be cautioned that a low relative importance does not imply lack of significance or of predictive value. For example, gender is a well-known predictor of mortality. The variation explainable by other factors of the data can greatly exceed the variation arising from gender, and interactions with other variables like age can further rob gender of importance attributed to it. The effect then is to push gender down the feature importance list.
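
Permutation feature importance, as described above, can be sketched with a simple linear model on simulated data. The data, the helper names `mse` and `permutation_importance`, and the choice of mean squared error as the performance measure are assumptions of this example.

```r
set.seed(42)
n  <- 500
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
df$y <- 3 * df$x1 + rnorm(n)                        # x2 is pure noise by construction

fit <- lm(y ~ x1 + x2, data = df)

mse <- function(model, data) mean((data$y - predict(model, newdata = data))^2)

## Hypothetical helper: relative increase in error after scrambling each feature
permutation_importance <- function(model, data, features) {
  base_err <- mse(model, data)
  sapply(features, function(feat) {
    shuffled <- data
    shuffled[[feat]] <- sample(shuffled[[feat]])    # break the link between feature and outcome
    (mse(model, shuffled) - base_err) / base_err
  })
}

imp <- permutation_importance(fit, df, c("x1", "x2"))
imp
```

In practice the shuffle would usually be repeated several times and evaluated on held-out data, since a single permutation is noisy.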

Goodness-of-Fit

No matter how well a model may behave on measures of feature importance, lift, Lorenz and Gini indices, mean square error, deviance, and so on, it is nonetheless important to check goodness-of-fit. Goodness-of-fit checks allow us to see how well a model reproduces the phenomena of interest. For our purposes, this is the same as checking ratios of actual claims to model predicted claims. In each model section, there are univariate and bivariate goodness-of-fit tables. Ideally, we should see 100% for all entries. For the GLM model and the univariate goodness-of-fit checks, we will see this throughout the tables of goodness-of-fit, as a non-penalized GLM will reproduce the margins for any included categorical variable or interaction of categorical variables. For space reasons, we omit a test for ratios significantly different from 100%. However, qualitatively, ratios far from 100%, perhaps +/- 5% or +/- 10%, should be deemed as evidence of poor fit for that cell.
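
The claim that a non-penalized GLM reproduces the margins of its categorical variables can be checked on simulated data. The sketch below assumes a Poisson model with log link and an offset, and uses made-up variable names (`plan`, `sex`, `expected`, `actual`) and made-up mortality factors; it does not use the ILEC data.

```r
set.seed(123)
n <- 2000
sim <- data.frame(
  plan     = factor(sample(c("Other", "Perm", "Term"), n, replace = TRUE)),
  sex      = factor(sample(c("F", "M"), n, replace = TRUE)),
  expected = runif(n, 0.5, 5)                       # stand-in for tabular expected claims
)
true_factor <- c(Other = 1.0, Perm = 0.9, Term = 0.8)[as.character(sim$plan)]
sim$actual  <- rpois(n, lambda = sim$expected * true_factor)

fit <- glm(actual ~ plan + sex + offset(log(expected)), family = poisson, data = sim)
sim$model <- fitted(fit)

## Actual-to-model ratio by level of a categorical variable
am_ratio <- function(by) {
  tapply(sim$actual, sim[[by]], sum) / tapply(sim$model, sim[[by]], sum)
}

am_plan <- am_ratio("plan")
am_sex  <- am_ratio("sex")
round(100 * c(am_plan, am_sex), 1)                  # 100.0 for every level, on the training data
```

The property holds on the data used to fit the model and only for variables included in it; ratios on held-out data or for omitted interactions can differ from 100%.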

Framework Preparation

Before getting to the core data analysis task, we first prepare the R environment by configuring display and model options and loading the necessary libraries. We then load the data and prepare it for analysis and modeling: the dataset is read in and split into training and testing sets based on the observation year. Most parameters are also set here. If prototyping is enabled, a smaller subset of the training data is created for quicker processing.

Here, we set display options.

#-----------------------------------------#
##### Display Options #####
#-----------------------------------------#
## turn off scientific notation
options(scipen = 999)

## change how many digits to display
options(digits=4)

## Suppress warnings
options(warn = -1)

## Determine which output we are generating
## This will be html, docx, or pdf
if(interactive()) {
  output_format <- "html"
} else {
  output_format <- knitr::opts_knit$get("rmarkdown.pandoc.to")  
}

Here we set model options for GLMNET and LightGBM.

#-----------------------------------------#
##### Model Options #####
#-----------------------------------------#

## When TRUE, only a fraction of the data is used, drastically reducing runtimes
prototype      <- FALSE
prototype_size <- 50000  ## number of records in data fraction
nTrainSeed     <- 42     ## seed to use when splitting the data for reproducibility

## GLMNet parameters
nGLMNetCores                <- 10
nInteractionDepth           <- 1
fGLMNetAlpha                <- 0.5
nUseTopLightGBMInteractions <- "ALL" # Integer or "ALL"
bUseSparse                  <- TRUE
nELSeed                     <- 13579

## LightGBM parameters 
flgbm_vis_subset     <- 0.1
bFullInteractions    <- FALSE  # Very slow in default configuration
nPlotTopFeatures     <- 3
nPlotTopInteractions <- 3
nGBMSeed             <- 1337

## Flags for running specific models
runGLM      <- TRUE
runLightGBM <- TRUE
runGLMInt   <- TRUE

Here we load all required libraries.

#-----------------------------------------#
##### Required libraries #####
#-----------------------------------------#
## Less verbose tidyverse
options (tidyverse.quiet = TRUE)

## We use bUseGroundhog in the R codespaces to control versioning,
## If you have your own setup, set to FALSE.
## Things may nonetheless break, so use at your own risk.
bUseGroundhog <- FALSE  

if(bUseGroundhog) {
  pkgDate <- "2024-05-09"
  
  suppressPackageStartupMessages( {
      library(groundhog)
      
      groundhog.library(pre,pkgDate)
      groundhog.library(lightgbm,pkgDate)
      groundhog.library(data.table,pkgDate)  
      groundhog.library(lmtest,pkgDate)
      groundhog.library(glmnet,pkgDate)
      groundhog.library(dplyr,pkgDate)
      groundhog.library(EIX,pkgDate)
      groundhog.library(ggplot2,pkgDate)
      groundhog.library(tidyr,pkgDate)
      groundhog.library(doParallel,pkgDate)
      groundhog.library(tidyverse,pkgDate)
      groundhog.library(magrittr,pkgDate)
      groundhog.library(dtplyr,pkgDate)
      groundhog.library(flextable,pkgDate)
      groundhog.library(ftExtra,pkgDate)
      groundhog.library(arrow,pkgDate)
      groundhog.library(here,pkgDate)
      groundhog.library(shapviz,pkgDate)
      groundhog.library(patchwork,pkgDate)
      groundhog.library(Matrix,pkgDate)
      groundhog.library(MatrixModels,pkgDate)
      groundhog.library(openxlsx2,pkgDate)
      groundhog.library(flexlsx,pkgDate)
    }
  )

} else {
  suppressPackageStartupMessages({
    library(pre)
    library(lightgbm)
    library(data.table)  
    library(lmtest)
    library(glmnet)
    library(dplyr)
    library(EIX)
    library(ggplot2)
    library(tidyr)
    library(doParallel)
    library(tidyverse)
    library(magrittr)
    library(dtplyr)
    library(flextable)
    library(ftExtra)
    library(arrow)
    library(here)
    library(shapviz)
    library(patchwork)
    library(Matrix)
    library(MatrixModels)
    library(openxlsx2)
    library(flexlsx)
  })
}

#-----------------------------------------#
##### Set Folder Locations #####
#-----------------------------------------#

source("R/functions.R")
source("R/glmnet_support.R")

## set library location
local_libraries <- FALSE
if(local_libraries)
{
  library.dir <- 'D:\\Data\\Niemerg\\Life Predictive Mortality POG\\01 library'
  .libPaths(new = library.dir)
}

bDebug <- FALSE

## Save the expensive working objects and reload if they exist. If this is true  
## and they do not exist, the computations will rerun.
bUseCache <- TRUE
bInvalidateCaches <- FALSE

#-----------------------------------------#
##### Data Options #####
#-----------------------------------------#

src_file <- 'http://finriskanalytics-ilecdata.s3-website-us-east-1.amazonaws.com/ilec13_17_framework_light.parquet'

cacheFileRoot <- file.path(
  getwd(),
  "objectcache",
  tools::file_path_sans_ext(
    tail(
      unlist(strsplit(src_file,"/")),
      1
    )
  )
)

exportsRoot <- file.path(
  getwd(),
  "render_exports",
  tools::file_path_sans_ext(
    tail(
      unlist(strsplit(src_file,"/")),
      1
    )
  )
)

#-----------------------------------------#
##### Modeling Parameters #####
#-----------------------------------------#
resp_var    <- "amount_actual"
resp_offset <- "amount_2015vbt"

pred_cols <- c("uw",
               "face_amount_band",
               "dur_band1",
               "ia_band1",
               "gender",
               "insurance_plan",
               "ltp",
               "iy_band1")

factor_cols <- c("uw",
               "face_amount_band",
               "dur_band1",
               "ia_band1",
               "gender",
               "insurance_plan",
               "ltp",
               "iy_band1")

Here we read the data, convert specified columns to categorical factors to ensure proper data handling and adjust the labels for the ‘face_amount_band’ factor to avoid issues in model outputs. The dataset is split into training and testing sets based on the observation year, with the year 2017 data used for validation. If prototyping is enabled, the code further subsets the training data to a smaller size for faster processing, ensuring reproducibility by setting a seed before shuffling and selecting the subset.

#-----------------------------------------#
##### Load dataset #####
#-----------------------------------------#

## Determine the file extension of the source file
file_type <- tools::file_ext(src_file)

## Read the dataset based on file extension
if (file_type == "csv") {
  ## Load a CSV file
  ds <- fread(src_file)
} else if (file_type == "parquet") {
  ## Load a Parquet file and convert it to a data table
  if(src_file %like% "amazonaws.com") {
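    ## The bucket name is taken to be the first component of the host name
    s3b <- s3_bucket(strsplit(urltools::domain(src_file), ".", fixed = TRUE)[[1]][1])
    ## Convert to a data.table so the := assignments below work on this branch too
    ds <- read_parquet(s3b$path(urltools::path(src_file))) %>% as.data.table()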
  } else {
    ds <- arrow::read_parquet(src_file) %>% as.data.table()
  }
}

#-----------------------------------------#
##### Convert columns to factors #####
#-----------------------------------------#

## This step ensures that categorical data is appropriately treated as such.
ds[, (factor_cols) := lapply(.SD, factor), .SDcols = factor_cols]

## Adjust the labels for the 'face_amount_band' factor
## Replace colons in factor names with hyphens to avoid issues in model outputs
ds[, face_amount_band := fct_relabel(
  face_amount_band,
  function(x) sub(":", " -", x, fixed = TRUE)
)]

#-----------------------------------------#
##### Set train / test #####
#-----------------------------------------#

## Split the dataset into training and testing sets based on the observation year
## The year 2017 is used as the validation set
train <- ds[observation_year != 2017]
test  <- ds[observation_year == 2017]

## Subset the data for prototyping purposes
## This code block is executed if a prototype subset is requested
if (prototype) {
  ## Set the seed for reproducibility
  set.seed(nTrainSeed)
  
  ## Shuffle and select a subset of the training data
  train <- train[sample(1:nrow(train)), ]
  train <- head(train, prototype_size)  ## head() avoids NA rows if prototype_size > nrow(train)
}

Models

GLM

Below is an analysis using a main-effects GLM to better understand the data. We integrate two modeling approaches:

  • Standard GLM analysis with model calibration (but no model building), including presentation of coefficients and residuals analysis, and
  • An approach to explore the interactions of the main effects model along selected dimensions of the data, checking the average main effects for those subsets, weighted according to the offset used in the analysis.

Model Summary

## Construct the model formula from predictor columns
modelFormula <- paste(pred_cols, collapse = " + ")

## Include offset in the model formula if it exists
if (exists("resp_offset")) {
  modelFormula <- paste(
    modelFormula,
    paste0("offset(log(", resp_offset, "))"),
    sep = " + "
  )
}

## Complete the model formula with the response variable
modelFormula <- paste(resp_var, modelFormula, sep = " ~ ")
modelFormula <- as.formula(modelFormula)

## Check if the cached model exists and is valid
if (bUseCache && file.exists(paste0(cacheFileRoot, "_glm_model.rds")) && !bInvalidateCaches) {
  ## Load the cached model if it exists
  modelGLM <- readRDS(paste0(cacheFileRoot, "_glm_model.rds"))
} else {
  ## Fit the GLM model using the specified formula and data
  modelGLM <- glm(
    formula = modelFormula, 
    family = quasipoisson, 
    data = train,
    x = FALSE,
    y = FALSE,
    model = FALSE
  )
  ## Remove the data from the model object to reduce size
  modelGLM$data <- c()
  
  ## Save the fitted model to cache if caching is enabled
  if (bUseCache) {
    saveRDS(modelGLM, paste0(cacheFileRoot, "_glm_model.rds"))
  }
}

## Append predictions to the dataset
ds[, predictions_glm := predict(modelGLM, newdata = .SD, type = "response")]
train[, predictions_glm := predict(modelGLM, newdata = .SD, type = "response")]
test[, predictions_glm := predict(modelGLM, newdata = .SD, type = "response")]

Model, Table of Coefficients, ANOVA

Model Summary

Below is the table of coefficients for the fitted GLM. Each entry is a coefficient in the table for the level of the indicated variable. The estimate and standard errors are on the scale of the linear predictor. For a Poisson model with log link, this means they are on the log scale.

## Show summary of the model
modelGLM %>% 
  as_flextable() %>%
  set_table_properties(opts_html=list(
    scroll=list(
      add_css="max-height: 500px;"
      )
    )
    )

Term                                    Estimate  Standard Error  z value  Pr(>|z|)  Signif.
(Intercept)                                0.978           0.091   10.743    0.0000  ***
uwN/2/1                                   -0.152           0.011  -13.931    0.0000  ***
uwN/2/2                                    0.248           0.010   23.945    0.0000  ***
uwN/3/1                                   -0.324           0.014  -23.614    0.0000  ***
uwN/3/2                                   -0.180           0.012  -14.802    0.0000  ***
uwN/3/3                                    0.152           0.011   14.077    0.0000  ***
uwN/4/1                                   -0.330           0.014  -23.555    0.0000  ***
uwN/4/2                                   -0.166           0.015  -11.060    0.0000  ***
uwN/4/3                                    0.011           0.017    0.630    0.5287
uwN/4/4                                    0.208           0.016   12.618    0.0000  ***
uwS/1/1                                    0.068           0.014    4.778    0.0000  ***
uwS/2/1                                   -0.179           0.021   -8.637    0.0000  ***
uwS/2/2                                    0.112           0.022    5.178    0.0000  ***
uwU/1/1                                    0.261           0.038    6.794    0.0000  ***
face_amount_band04 - 50,000 - 99,999      -0.113           0.018   -6.410    0.0000  ***
face_amount_band05 - 100,000 - 249,999    -0.236           0.015  -15.413    0.0000  ***
face_amount_band06 - 250,000 - 499,999    -0.299           0.016  -18.893    0.0000  ***
face_amount_band07 - 500,000 - 999,999    -0.322           0.016  -20.258    0.0000  ***
face_amount_band08 - 1,000,000+           -0.349           0.015  -23.004    0.0000  ***
dur_band102                                0.010           0.030    0.341    0.7334
dur_band103                                0.000           0.029    0.006    0.9951
dur_band104-05                            -0.044           0.026   -1.686    0.0917  .
dur_band106-15                            -0.115           0.028   -4.084    0.0000  ***
dur_band116-25                            -0.096           0.031   -3.088    0.0020  **
ia_band125-34                             -0.062           0.034   -1.813    0.0699  .
ia_band135-44                             -0.045           0.033   -1.347    0.1779
ia_band145-54                             -0.075           0.033   -2.258    0.0239  *
ia_band155-64                             -0.100           0.033   -3.024    0.0025  **
ia_band165-74                             -0.082           0.033   -2.440    0.0147  *
ia_band175-84                             -0.173           0.034   -5.100    0.0000  ***
ia_band185-99                             -0.304           0.040   -7.581    0.0000  ***
genderM                                    0.011           0.006    1.790    0.0735  .
insurance_planPerm                        -0.134           0.062   -2.159    0.0308  *
insurance_planTerm                        -0.275           0.069   -3.977    0.0001  ***
insurance_planxL                          -0.042           0.061   -0.681    0.4958
ltp10 yr                                  -0.090           0.034   -2.650    0.0080  **
ltp15 yr                                  -0.097           0.034   -2.826    0.0047  **
ltp20 yr                                  -0.207           0.033   -6.286    0.0000  ***
ltp25 yr                                  -0.247           0.049   -5.077    0.0000  ***
ltp30 yr                                  -0.161           0.036   -4.502    0.0000  ***
ltpNot Level Term                         -0.396           0.045   -8.750    0.0000  ***
ltpUnknown                                -0.207           0.036   -5.804    0.0000  ***
iy_band11990-1999                         -0.073           0.023   -3.209    0.0013  **
iy_band12000-2009                         -0.148           0.026   -5.706    0.0000  ***
iy_band12010+                             -0.206           0.030   -6.857    0.0000  ***

Signif. codes: 0 <= '***' < 0.001 < '**' < 0.01 < '*' < 0.05

(Dispersion parameter for quasipoisson family taken to be 700118)

Null deviance: 6.172e+10 on 249979 degrees of freedom

Residual deviance: 5.662e+10 on 249935 degrees of freedom
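
Because the estimates are on the log scale, exponentiating them gives multiplicative factors relative to each variable's reference level. The sketch below does this for three coefficients copied from the table of coefficients, and adds an approximate 95% interval; the normal-approximation (Wald) interval and the choice of rows are assumptions made for illustration.

```r
## Estimates and standard errors copied from the table of coefficients
coefs <- data.frame(
  term     = c("uwN/2/2", "insurance_planTerm", "iy_band12010+"),
  estimate = c(0.248, -0.275, -0.206),
  se       = c(0.010, 0.069, 0.030)
)

## Multiplicative factor relative to the reference level, with a Wald-type interval
coefs$factor <- exp(coefs$estimate)
coefs$lower  <- exp(coefs$estimate - 1.96 * coefs$se)
coefs$upper  <- exp(coefs$estimate + 1.96 * coefs$se)
coefs
```

For example, the Term coefficient of -0.275 corresponds to a factor of roughly 76% relative to the reference plan, holding the other variables fixed.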

ANOVA

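The ANOVA table displays the sequential analysis of deviance for the GLM. For each variable, added in the order shown, we see the deviance explained by that variable, its associated degrees of freedom, the residual deviance and degrees of freedom after adding it, and the p-value of the corresponding chi-squared test.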

## Show the ANOVA table

## Check if the cached ANOVA results exist and are valid
if (bUseCache && file.exists(paste0(cacheFileRoot, "_glm_model_anova.rds")) && !bInvalidateCaches) {
  ## Load the cached ANOVA results if they exist
  mod.glm.anova <- readRDS(paste0(cacheFileRoot, "_glm_model_anova.rds"))
} else {
  ## Perform ANOVA on the GLM model
  mod.glm.anova <- anova(modelGLM, test = "Chisq")
  
  ## Save the ANOVA results to cache if caching is enabled
  if (bUseCache) {
    saveRDS(mod.glm.anova, paste0(cacheFileRoot, "_glm_model_anova.rds"))
  }
}

## Convert the ANOVA results to a data table, add feature names, and format the table
mod.glm.anova %>% 
  as.data.table() %>%  # Convert to data table
  add_column(rownames(mod.glm.anova), .before = 1) %>%  # Add feature names as a new column
  setnames(old = 1, new = "feature") %>%  # Rename the new column to "feature"
  flextable() %>%  # Create a flextable for formatting
  set_formatter(
    `Pr(>Chi)` = function(x) ifelse(x < 0.001, "< 0.1%", sprintf("%1.2f%%", 100 * x))  # Format p-values; threshold matches the "< 0.1%" label
  )

feature             Df       Deviance  Resid. Df      Resid. Dev  Pr(>Chi)
NULL                                     249,979  61,716,566,139
uw                  13  3,858,650,451    249,966  57,857,915,688    < 0.1%
face_amount_band     5    701,275,400    249,961  57,156,640,288    < 0.1%
dur_band1            5    118,698,616    249,956  57,037,941,672    < 0.1%
ia_band1             7    162,999,155    249,949  56,874,942,517    < 0.1%
gender               1      3,395,713    249,948  56,871,546,804     2.76%
insurance_plan       3     57,762,926    249,945  56,813,783,878    < 0.1%
ltp                  7    152,501,012    249,938  56,661,282,865    < 0.1%
iy_band1             3     41,049,822    249,935  56,620,233,043    < 0.1%

Lift

The lift plot compares the GLM against the underlying mortality table.

## generate lift plot
test[,decile.table(get(resp_var),predictions_glm/get(resp_offset),get(resp_offset))] %>%
  pivot_longer(-c(decile,exposures)) %>% 
  as.data.table() %>%
  ggplot(aes(x=decile, y=value, col=name)) +  
  geom_line() +
  scale_x_continuous(breaks=c(1:10)) +
  labs(x="Decile",y="Response")+
  ggtitle("Decile Lift Plot") +
  theme_minimal() +
  scale_y_continuous(labels=scales::comma)

Lorenz Plot

The Lorenz plot demonstrates a model’s ability to stratify predictions against a null baseline.

## lorenz plot
test[,lorenz(get(resp_var), predictions_glm / get(resp_offset), get(resp_offset))]

The table of coefficients shows a number of interesting phenomena and perhaps some surprises:

  • Gender is not significant at the 5% level in the table of coefficients (p = 0.0735), although the sequential ANOVA test gives 2.76%. Since we are using an offset of tabular expected rates, the interpretation is that the underlying differentials in the tabular expected rates are adequate for the current data, after adjusting for other factors.
  • Underwriting is the most influential factor from the ANOVA perspective.
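  • Both the most recent issue years (2010+) and the later durations show significant mortality factors. In the table of coefficients, durations 2 and 3 are not significantly different from duration 1 (the reference level), while durations 6-15 and 16-25 are significantly lower.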
  • While the Perm and Term coefficients are significantly different from 0 relative to “Other” (xL is not, with p = 0.4958), a quick glance at the effects plot shows that the UL/VL plans are not significantly different from each other or from Perm, while Term is borderline significantly different from UL/VL.
  • Face amount bands 250K and greater have factors that are not significantly different from one another.
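
The ANOVA observation about underwriting can be quantified by expressing each variable's deviance as a share of the total. The figures below are copied from the ANOVA table; the two share definitions (relative to the deviance explained by the model, and relative to the null deviance) are choices made for this illustration.

```r
## Sequential deviance by variable, copied from the ANOVA table
anova_dev <- c(
  uw = 3858650451, face_amount_band = 701275400, dur_band1 = 118698616,
  ia_band1 = 162999155, gender = 3395713, insurance_plan = 57762926,
  ltp = 152501012, iy_band1 = 41049822
)
null_deviance <- 61716566139

share_of_explained <- anova_dev / sum(anova_dev)    # share of what the model explains
share_of_null      <- anova_dev / null_deviance     # share of the total null deviance

round(100 * cbind(share_of_explained, share_of_null), 2)
```

Because the table is sequential, these shares depend on the order in which variables enter the model, so a different ordering could shift deviance between correlated variables.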

Model Illustrations and Graphics

Effects Plots

Because the model contains every column, this is equivalent to computing the marginal actual-to-tabular ratios. However, the model also provides standard errors, which is useful for assessing the significance of the marginal ratios.

## Plot amount model terms

## Calculate the dispersion parameter for the GLM model
glm_disp <- sum(modelGLM$residuals^2 * modelGLM$weights) / modelGLM$df.residual

## Generate plots for each predictor column
lapply(pred_cols, function(s) {
  ## Summarize the data for the current predictor
  p <- ds[, .(
      predicted = sum(predictions_glm) / sum(amount_2015vbt),
      stde = sqrt(sum(predictions_glm) * glm_disp) / sum(amount_2015vbt)
    ), by = c(s)] %>%
    setnames(s, "x") %>%  # Rename the grouping column to "x"
    mutate(
      x = fct_relevel(
        x,
        sort(levels(x))  # Reorder factor levels
      )
    ) %>%
    as.data.table() %>%
    ## Create the plot
    ggplot(aes(x = x, y = predicted)) +
    geom_point() +  # Add points for the predicted values
    geom_errorbar(
      aes(
        ymin = predicted - 1.96 * stde,
        ymax = predicted + 1.96 * stde
      )  # Add error bars for the 95% confidence interval
    ) +
    geom_hline(yintercept = 1, linetype = 2) +  # Add a horizontal line at y = 1
    scale_y_continuous(
      name = "Factor",
      labels = scales::percent  # Format y-axis labels as percentages
    ) +
    scale_x_discrete(name = s) +  # Set x-axis name to the current predictor
    theme_minimal() +  # Use a minimal theme for the plot
    theme(
      axis.text.x = element_text(angle = 45)  # Rotate x-axis text for readability
    )
  
  return(p)  # Return the plot
}) %>%
  purrr::set_names(pred_cols) %>%  # Set names for each plot based on predictor columns
  iwalk(~ {
    cat('##### ', .y, '\n\n')  # Print the plot title
    print(.x)  # Print the plot
    cat('\n\n')  # Add spacing after each plot
  })

[Effects plots, one per predictor: uw, face_amount_band, dur_band1, ia_band1, gender, insurance_plan, ltp, iy_band1]

Goodness-of-Fit

Goodness-of-fit tables are provided. Each table provides actual-to-model ratios for single variables and for 2-way combinations of variables. A model is qualitatively deemed to perform well if goodness-of-fit ratios are close to 100% in almost all situations. The quantitative assessment using significance testing is omitted here.

Univariate Goodness-of-Fit

## Generate summary tables for each factor column and format them
map(factor_cols, .f = \(x) {
  ## Convert column name to symbol for tidy evaluation
  x <- sym(x)
  resp_var_sym <- resp_var
  
  ## Summarize data by the current factor column
  train %>%
    group_by(!!x) %>%
    summarize(
      Outcome = sum(amount_actual),  # Sum the actual amounts
      AM = sum(amount_actual) / sum(predictions_glm)  # Calculate the Actual-to-Model ratio
    ) %>%
    ## Create a flextable for the summarized data
    flextable() %>%
    ## Format the Actual-to-Model column as percentages
    set_formatter(
      AM = function(x) {
        if (is.numeric(x))
          sprintf("%.1f%%", x * 100)
        else
          x
      }
    ) %>%
    ## Format the Outcome column as numbers
    colformat_num(j = "Outcome") %>%
    ## Set header labels for the table
    set_header_labels(
      Outcome = "Outcome",
      AM = "Actual-to-Model"
    ) %>%
    autofit() # %>%
    ## Print the flextable
    #knitr::knit_print()
}) %>%
  ## Set names for each table based on the factor columns
  purrr::set_names(factor_cols) -> 
  output_tables

if(output_format == "html") {
  output_tables %>%
    ## Print the flextable
    map(.f=knitr::knit_print) %>%
    ## Generate a tabset from the list of tables
    generate_tabset(
      tabtitle = "",
      tablevel = 5
    ) %>%
    ## Print the generated tabset
    cat()  
} else {
  export_tables_to_excel(
    output_tables,
    paste0(
        exportsRoot,
        "_glm_univariate_goodness_of_fit_tables.xlsx"
      )
  )
  
  cat("See included Excel table for additional information.\n")
}
uw

uw      Outcome          Actual-to-Model
N/1/1   19,203,624,009   100.0%
N/2/1   9,661,768,053    100.0%
N/2/2   11,546,531,146   100.0%
N/3/1   5,950,371,727    100.0%
N/3/2   8,810,968,555    100.0%
N/3/3   14,970,350,057   100.0%
N/4/1   6,842,959,121    100.0%
N/4/2   5,162,511,227    100.0%
N/4/3   3,565,224,892    100.0%
N/4/4   3,742,908,023    100.0%
S/1/1   4,365,025,076    100.0%
S/2/1   1,910,521,866    100.0%
S/2/2   1,704,933,729    100.0%
U/1/1   498,038,979      100.0%

face_amount_band

face_amount_band         Outcome          Actual-to-Model
01 - 0 - 49,999          4,193,809,072    100.0%
04 - 50,000 - 99,999     5,283,024,574    100.0%
05 - 100,000 - 249,999   16,241,007,604   100.0%
06 - 250,000 - 499,999   14,745,604,133   100.0%
07 - 500,000 - 999,999   15,180,744,107   100.0%
08 - 1,000,000+          42,291,546,970   100.0%

dur_band1

dur_band1   Outcome          Actual-to-Model
01          1,269,136,295    100.0%
02          1,852,767,081    100.0%
03          2,494,157,892    100.0%
04-05       6,321,445,332    100.0%
06-15       55,677,490,288   100.0%
16-25       30,320,739,572   100.0%

ia_band1

ia_band1   Outcome          Actual-to-Model
18-24      667,931,081      100.0%
25-34      6,267,527,815    100.0%
35-44      16,990,453,137   100.0%
45-54      20,629,510,315   100.0%
55-64      19,447,152,090   100.0%
65-74      16,608,837,516   100.0%
75-84      15,884,011,950   100.0%
85-99      1,440,312,556    100.0%

gender

gender   Outcome          Actual-to-Model
F        33,257,191,935   100.0%
M        64,678,544,525   100.0%

insurance_plan

insurance_plan   Outcome          Actual-to-Model
Other            187,352,300      100.0%
Perm             12,161,923,261   100.0%
Term             39,023,616,053   100.0%
xL               46,562,844,846   100.0%

ltp

ltp              Outcome          Actual-to-Model
5 yr             686,343,514      100.0%
10 yr            7,271,664,723    100.0%
15 yr            6,017,756,560    100.0%
20 yr            17,767,048,607   100.0%
25 yr            525,222,922      100.0%
30 yr            3,092,731,785    100.0%
Not Level Term   59,614,315,535   100.0%
Unknown          2,960,652,814    100.0%

iy_band1

iy_band1    Outcome          Actual-to-Model
1900-1989   1,464,179,514    100.0%
1990-1999   29,544,147,346   100.0%
2000-2009   55,045,293,308   100.0%
2010+       11,882,116,292   100.0%
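Every single-variable ratio above is exactly 100.0%. This is what one would expect if the GLM uses a log link with a Poisson-type family and includes each of these factors as a main effect: the likelihood equations then force total fitted to equal total actual within every level of every included factor. The fitting code is outside this section, so that model form is an assumption here. The sketch below demonstrates the property on simulated data; the data set, column names, and the helper `am_by()` are hypothetical.

```r
set.seed(1)
n <- 500

# Simulated toy data (hypothetical): two factors and a tabular expected basis
toy <- data.frame(
  gender   = factor(sample(c("F", "M"), n, replace = TRUE)),
  plan     = factor(sample(c("Perm", "Term", "xL"), n, replace = TRUE)),
  expected = runif(n, 0.5, 2)
)

# True claim rate contains a gender-by-plan interaction that the model below ignores
true_rate <- with(toy, expected * ifelse(gender == "M" & plan == "Term", 1.5, 1))
toy$actual <- rpois(n, true_rate)

# Main-effects Poisson GLM with log link and the expected basis as an offset
fit <- glm(
  actual ~ gender + plan + offset(log(expected)),
  family = poisson(),
  data = toy
)
toy$predicted <- fitted(fit)

# Actual-to-model ratio by the levels of a grouping factor f
am_by <- function(f) tapply(toy$actual, f, sum) / tapply(toy$predicted, f, sum)

am_gender  <- am_by(toy$gender)                          # single-variable
am_plan    <- am_by(toy$plan)                            # single-variable
am_two_way <- am_by(interaction(toy$gender, toy$plan))   # two-way

print(round(100 * am_gender, 1))
print(round(100 * am_plan, 1))
print(round(100 * am_two_way, 1))
```

Under these assumptions the single-variable ratios equal 100% up to numerical tolerance regardless of how well the model fits, so they carry little diagnostic information. The two-way ratios are not constrained in this way, which is why the bivariate tables are the more informative check.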

Bivariate Goodness-of-Fit
## Generate a list of unique pairs of factor columns
pairs <- combn(factor_cols, 2)
pairlist <- data.table(F1 = pairs[1, ], F2 = pairs[2, ])

## Generate and format summary tables for each pair of factor columns
map2(.x = pairlist$F1, .y = pairlist$F2, .f = \(x, y) {
  ## Put the factor with more levels on the rows and spread the other across columns
  if (nlevels(train[[x]]) >= nlevels(train[[y]])) {
    row_var <- x
    col_var <- y
  } else {
    row_var <- y
    col_var <- x
  }
  rs <- sym(row_var)
  cs <- sym(col_var)

  ## Summarize actual amounts and actual-to-model ratios for each combination
  fttmp <- train %>%
    group_by(!!rs, !!cs) %>%
    summarize(
      Outcome = sum(amount_actual),
      Ratio = sprintf("%.1f%%", 100 * sum(amount_actual) / sum(predictions_glm)),
      .groups = "drop"
    ) %>%
    pivot_wider(
      names_from = !!cs,
      values_from = c(Outcome, Ratio),
      names_glue = paste0(col_var, ": {", col_var, "}.{.value}"),
      names_vary = "slowest"
    )
  
  ## Adjust column keys for the flextable
  fttmp.colkeys <- names(fttmp)[1]
  for (i in 1:((length(names(fttmp)) - 1) / 2)) {
    fttmp.colkeys <- c(fttmp.colkeys, paste0("blank", i), names(fttmp)[(2 * i):(2 * i + 1)])
  }
  
  ## Create and print the flextable
  fttmp %>%
    flextable(col_keys = fttmp.colkeys) %>%
    ftExtra::span_header(sep = "\\.") %>%
    align(align = 'center', part = "all") %>%
    empty_blanks() %>%
    autofit() #%>%
    #knitr::knit_print()
}) %>%
  ## Set names for each element in the list based on the factor column pairs
  purrr::set_names(pairlist[, paste0(F1, " x ", F2)]) ->
  output_tables

if(output_format == "html") {
  output_tables %>%
    map(.f=knitr::knit_print) %>%
    ## Generate a tabset from the list of formatted tables
    generate_tabset(tabtitle = "", tablevel = 5) %>%
    ## Print the generated tabset
    cat()
} else {
  export_tables_to_excel(
    output_tables,
    paste0(
        exportsRoot,
        "_glm_bivariate_goodness_of_fit_tables.xlsx"
      )
  )
  
  cat("See included Excel table for additional information.\n")
}
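The layout logic of the chunk above (the factor with more levels goes on the rows, the other is spread across columns, and each cell holds total actual over total predicted) can be sketched in base R as follows. The toy data and the helper `two_way_am()` are hypothetical; the tutorial itself uses dplyr, tidyr, and flextable for this step.

```r
# Toy data (hypothetical values): two factors, actual amounts, model predictions
toy <- data.frame(
  dur       = factor(c("01", "01", "02", "02", "03", "03")),
  gender    = factor(c("F", "M", "F", "M", "F", "M")),
  actual    = c(10, 30, 20, 45, 25, 50),
  predicted = c(12.5, 30, 16, 50, 25, 40)
)

# Two-way actual-to-model ratios as a matrix (hypothetical helper).
# x and y are column names; both columns are assumed to be factors.
two_way_am <- function(data, x, y) {
  # Rows: the factor with more levels (ties go to x)
  if (nlevels(data[[x]]) >= nlevels(data[[y]])) {
    row_var <- x
    col_var <- y
  } else {
    row_var <- y
    col_var <- x
  }
  groups <- list(data[[row_var]], data[[col_var]])
  actual <- tapply(data$actual, groups, sum)
  predicted <- tapply(data$predicted, groups, sum)
  am <- actual / predicted  # combinations with no data come out as NA
  names(dimnames(am)) <- c(row_var, col_var)
  am
}

am <- two_way_am(toy, "gender", "dur")
print(round(100 * am, 1))
```

Combinations that never occur in the data appear as `NA` in this sketch, which corresponds to the empty cells in some of the tables below.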
uw x face_amount_band

uw

face_amount_band: 01 - 0 - 49,999

face_amount_band: 04 - 50,000 - 99,999

face_amount_band: 05 - 100,000 - 249,999

face_amount_band: 06 - 250,000 - 499,999

face_amount_band: 07 - 500,000 - 999,999

face_amount_band: 08 - 1,000,000+

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

N/1/1

2,331,609,989

99.5%

2,665,573,911

99.3%

4,377,227,666

98.0%

2,484,222,240

97.0%

2,280,087,941

97.1%

5,064,902,262

105.6%

N/2/1

67,510,912

104.4%

227,491,364

103.6%

1,877,102,924

100.7%

1,581,998,866

100.7%

1,613,477,143

101.9%

4,294,186,844

98.5%

N/2/2

478,620,100

97.9%

716,716,216

101.7%

2,460,749,546

101.5%

1,807,687,840

99.4%

1,536,284,872

98.7%

4,546,472,572

99.9%

N/3/1

4,589,248

123.5%

36,548,841

109.3%

636,379,622

103.7%

1,099,392,788

101.8%

1,269,934,939

99.7%

2,903,526,289

98.6%

N/3/2

15,879,147

108.5%

84,477,290

113.1%

824,647,078

103.7%

1,194,151,445

104.1%

1,294,166,133

101.7%

5,397,647,462

98.0%

N/3/3

89,078,013

102.7%

266,651,094

95.5%

1,685,052,988

100.4%

1,897,715,243

100.0%

1,940,524,074

101.9%

9,091,328,645

99.6%

N/4/1

2,890,726

85.7%

23,168,994

98.8%

533,862,585

98.0%

1,152,655,538

101.6%

1,611,064,623

98.0%

3,519,316,655

100.8%

N/4/2

5,630,152

92.5%

43,592,461

100.9%

580,655,929

100.2%

959,152,506

106.4%

1,142,130,198

101.0%

2,431,349,981

97.2%

N/4/3

2,758,598

97.9%

19,493,024

95.6%

410,982,168

104.1%

652,807,204

97.6%

822,151,439

106.1%

1,657,032,459

97.3%

N/4/4

16,764,842

89.3%

69,558,070

97.5%

501,578,433

96.3%

668,382,546

96.0%

698,077,315

99.6%

1,788,546,817

103.1%

S/1/1

846,343,572

100.2%

864,666,577

100.8%

1,161,486,983

98.4%

510,128,489

98.7%

361,498,469

88.0%

620,900,986

112.0%

S/2/1

21,486,405

106.5%

85,321,233

108.0%

614,066,701

103.4%

400,530,746

99.9%

315,167,594

101.5%

473,949,187

93.6%

S/2/2

74,338,988

105.5%

135,934,467

97.3%

492,786,516

99.0%

294,838,420

97.5%

267,040,410

113.8%

439,994,928

95.6%

U/1/1

236,308,380

105.1%

43,831,032

84.9%

84,428,465

95.8%

41,940,262

87.0%

29,138,957

82.8%

62,391,883

124.6%

uw x dur_band1

uw

dur_band1: 01

dur_band1: 02

dur_band1: 03

dur_band1: 04-05

dur_band1: 06-15

dur_band1: 16-25

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

N/1/1

107,404,206

95.6%

127,611,866

107.1%

186,488,501

112.4%

495,415,170

119.6%

4,962,783,774

101.2%

13,323,920,492

98.8%

N/2/1

38,939,578

65.2%

117,924,754

143.0%

103,706,358

94.4%

187,685,672

70.1%

4,423,311,154

96.1%

4,790,200,537

105.6%

N/2/2

68,820,764

105.6%

89,429,859

102.0%

153,874,204

132.7%

321,513,019

104.9%

6,248,105,591

98.1%

4,664,787,709

101.4%

N/3/1

91,917,216

87.5%

162,393,895

101.4%

239,046,418

113.4%

463,263,819

94.1%

4,110,289,163

102.1%

883,461,216

92.5%

N/3/2

73,068,462

88.3%

142,447,178

110.4%

172,423,532

97.4%

471,435,974

96.3%

7,112,246,310

100.0%

839,347,099

101.9%

N/3/3

209,144,931

114.7%

274,764,174

100.2%

374,385,325

102.6%

1,050,466,173

109.0%

12,379,149,521

99.7%

682,439,933

88.3%

N/4/1

174,316,019

96.2%

228,792,240

79.4%

389,498,840

97.6%

1,034,633,390

104.2%

4,624,074,611

100.7%

391,644,021

100.5%

N/4/2

115,650,082

104.0%

173,110,541

98.6%

196,392,926

78.7%

696,533,900

100.7%

3,717,590,891

101.6%

263,232,887

95.8%

N/4/3

97,264,439

82.8%

192,504,643

110.5%

206,119,588

89.9%

501,287,156

91.0%

2,450,944,955

103.1%

117,104,111

101.4%

N/4/4

153,464,379

111.5%

195,703,696

94.8%

283,447,385

103.2%

660,958,667

97.3%

2,334,500,075

101.7%

114,833,821

76.4%

S/1/1

30,145,775

147.3%

28,847,772

106.8%

40,393,654

116.2%

91,257,938

102.5%

1,121,084,058

104.1%

3,053,295,879

98.0%

S/2/1

42,753,182

111.3%

58,621,979

100.4%

74,856,996

95.9%

177,694,957

94.5%

1,089,203,131

94.4%

467,391,621

118.9%

S/2/2

55,705,006

124.3%

55,852,780

86.4%

68,480,349

87.3%

160,693,564

87.0%

985,575,007

99.4%

378,627,023

111.0%

U/1/1

10,542,256

100.1%

4,761,704

82.7%

5,043,816

87.2%

8,605,933

76.0%

118,632,047

114.3%

350,453,223

97.1%

uw x ia_band1

uw

ia_band1: 18-24

ia_band1: 25-34

ia_band1: 35-44

ia_band1: 45-54

ia_band1: 55-64

ia_band1: 65-74

ia_band1: 75-84

ia_band1: 85-99

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

N/1/1

195,107,686

96.3%

1,468,766,511

91.2%

3,219,766,527

91.3%

3,573,783,086

93.9%

4,457,649,299

103.5%

4,148,815,415

106.2%

1,980,748,076

115.6%

158,987,409

120.4%

N/2/1

87,770,402

105.3%

804,579,965

97.8%

1,676,447,802

93.7%

1,660,839,957

94.5%

1,585,401,151

96.2%

2,098,151,266

103.6%

1,649,244,265

111.5%

99,333,245

172.7%

N/2/2

82,359,855

103.6%

734,969,884

107.3%

1,537,827,224

100.9%

1,935,817,830

104.4%

1,907,603,917

99.1%

2,320,063,956

98.1%

2,742,077,451

97.5%

285,811,029

94.1%

N/3/1

42,971,227

94.3%

542,310,521

103.9%

1,509,846,536

99.2%

1,517,047,115

99.4%

1,083,241,294

99.5%

682,592,797

96.6%

557,962,237

106.1%

14,400,000

104.5%

N/3/2

26,380,775

131.2%

339,627,773

112.9%

1,177,249,350

110.7%

1,552,812,212

104.2%

1,501,263,082

98.8%

1,589,866,999

94.0%

2,421,887,776

94.6%

201,880,588

122.9%

N/3/3

29,150,024

93.3%

391,634,706

111.4%

1,364,327,286

116.9%

1,932,181,705

105.6%

2,306,037,454

102.7%

2,995,484,451

100.0%

5,412,062,505

94.2%

539,471,926

89.2%

N/4/1

26,287,124

77.6%

602,385,596

102.4%

2,097,046,161

105.1%

2,253,038,301

102.5%

1,386,295,039

92.5%

431,665,885

96.8%

43,983,146

55.2%

2,257,869

129.5%

N/4/2

9,905,787

101.6%

233,732,519

96.1%

1,112,629,077

104.3%

1,595,224,194

97.4%

1,413,399,308

100.2%

448,165,240

84.4%

313,744,476

145.1%

35,710,626

76.9%

N/4/3

7,030,001

83.1%

209,711,793

120.4%

705,969,366

99.8%

1,147,207,102

104.5%

986,258,205

98.7%

356,988,649

88.4%

140,290,698

90.0%

11,769,078

65.1%

N/4/4

9,949,832

98.6%

167,099,067

107.9%

638,256,939

107.0%

1,003,674,951

104.2%

958,007,317

95.6%

551,614,283

95.5%

345,232,117

97.4%

69,073,517

82.5%

S/1/1

68,766,581

104.2%

451,737,170

93.4%

1,077,472,651

95.2%

1,226,674,054

96.4%

936,351,195

110.9%

461,642,247

100.8%

139,279,941

131.2%

3,101,237

80.4%

S/2/1

26,876,275

144.2%

179,096,556

93.2%

474,899,912

95.3%

608,218,812

101.6%

402,893,657

101.2%

174,002,125

107.8%

42,414,641

99.9%

2,119,888

278.7%

S/2/2

17,471,313

129.9%

117,414,041

101.0%

333,323,589

97.3%

526,461,474

103.9%

400,538,301

90.6%

233,737,976

109.2%

63,750,567

101.0%

12,236,468

183.8%

U/1/1

37,904,199

83.5%

24,461,713

107.5%

65,390,717

112.1%

96,529,522

103.3%

122,212,871

105.3%

116,046,227

91.7%

31,334,054

96.9%

4,159,676

132.3%

uw x gender

        gender: F                 gender: M
uw      Outcome          Ratio    Outcome          Ratio
N/1/1   6,871,313,340    100.8%   12,332,310,669   99.6%
N/2/1   3,739,559,716    101.2%   5,922,208,337    99.3%
N/2/2   4,668,956,646    100.5%   6,877,574,500    99.6%
N/3/1   1,982,219,430    100.8%   3,968,152,297    99.6%
N/3/2   3,036,089,596    97.7%    5,774,878,959    101.3%
N/3/3   5,764,067,198    101.9%   9,206,282,859    98.8%
N/4/1   1,822,625,798    98.2%    5,020,333,323    100.7%
N/4/2   997,342,369      90.9%    4,165,168,858    102.5%
N/4/3   561,153,990      89.4%    3,004,070,902    102.3%
N/4/4   894,420,215      96.9%    2,848,487,808    101.0%
S/1/1   1,618,467,456    102.5%   2,746,557,620    98.6%
S/2/1   534,567,286      96.4%    1,375,954,580    101.5%
S/2/2   553,360,607      107.2%   1,151,573,122    96.9%
U/1/1   213,048,288      98.0%    284,990,691      101.5%

uw x insurance_plan

        insurance_plan: Other     insurance_plan: Perm      insurance_plan: Term      insurance_plan: xL
uw      Outcome          Ratio    Outcome          Ratio    Outcome          Ratio    Outcome          Ratio
N/1/1   41,338,662       134.5%   6,000,364,392    97.6%    3,545,626,085    89.4%    9,616,294,870    106.2%
N/2/1   10,081,665       86.4%    1,335,375,364    96.1%    2,232,053,273    96.2%    6,084,257,751    102.4%
N/2/2   15,784,451       80.0%    1,659,464,730    101.6%   3,001,480,499    109.6%   6,869,801,466    96.0%
N/3/1   40,123,942       295.6%   317,915,492      140.1%   3,500,287,255    96.0%    2,092,045,038    101.5%
N/3/2   20,508,375       105.8%   282,009,643      131.8%   3,266,885,153    105.3%   5,241,565,384    95.7%
N/3/3   32,085,711       110.3%   496,907,377      134.1%   4,052,652,325    100.8%   10,388,704,644   98.5%
N/4/1   2,300,000        31.2%    9,330,053        136.5%   6,251,798,169    101.0%   579,530,899      90.5%
N/4/2   7,223,579        59.0%    7,940,683        100.1%   4,382,666,145    100.3%   764,680,820      98.9%
N/4/3   103,684          1.9%     2,495,706        66.5%    3,036,110,841    101.6%   526,514,661      93.0%
N/4/4   3,739,130        20.3%    12,550,388       101.1%   2,742,467,039    101.1%   984,151,466      98.5%
S/1/1   3,801,542        137.6%   1,397,404,613    90.5%    1,062,735,519    108.1%   1,901,083,402    103.6%
S/2/1   6,544,922        105.9%   163,842,775      103.1%   1,131,185,266    99.2%    608,948,903      100.7%
S/2/2   3,059,454        31.6%    189,503,330      110.6%   778,511,625      97.7%    733,859,320      100.9%
U/1/1   657,183          67.3%    286,818,715      105.3%   39,156,859       82.2%    171,406,222      96.7%

uw x ltp

uw

ltp: 5 yr

ltp: 10 yr

ltp: 15 yr

ltp: 20 yr

ltp: 25 yr

ltp: 30 yr

ltp: Not Level Term

ltp: Unknown

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

N/1/1

167,693,145

109.1%

329,259,367

87.7%

313,430,210

105.4%

1,254,368,612

83.3%

22,798,990

117.5%

140,376,344

88.0%

15,709,773,948

102.8%

1,265,923,393

89.9%

N/2/1

109,944,549

104.3%

251,674,560

102.1%

154,929,926

135.1%

1,003,102,210

89.8%

54,969,322

93.3%

91,526,385

94.7%

7,455,632,156

101.2%

539,988,945

97.8%

N/2/2

101,292,185

97.6%

464,466,296

118.6%

319,758,321

113.0%

1,222,999,988

106.1%

275,278,999

105.5%

237,099,784

102.1%

8,573,983,079

96.9%

351,652,494

127.0%

N/3/1

41,020,000

109.2%

402,661,935

102.2%

437,170,279

100.8%

1,876,102,407

92.5%

32,892,500

81.8%

371,739,248

91.5%

2,700,444,573

106.6%

88,340,785

115.1%

N/3/2

62,023,000

152.0%

523,988,195

109.0%

476,579,573

99.7%

1,824,845,270

105.8%

12,110,000

59.7%

210,771,725

91.7%

5,612,039,277

97.2%

88,611,515

141.4%

N/3/3

59,980,899

76.7%

896,107,443

101.3%

652,611,790

96.4%

1,842,591,351

104.0%

16,475,000

92.4%

201,629,249

93.7%

11,161,374,076

99.6%

139,580,249

120.4%

N/4/1

4,410,000

189.8%

1,142,209,199

105.5%

1,124,192,978

94.8%

3,080,000,938

101.6%

36,372,000

125.8%

845,189,202

101.3%

591,162,804

90.3%

19,422,000

84.8%

N/4/2

985,000

47.5%

899,393,774

97.8%

918,062,758

97.6%

2,118,167,322

102.4%

14,432,100

97.5%

414,673,900

103.8%

779,845,082

98.3%

16,951,291

73.2%

N/4/3

300,000

40.4%

711,850,279

85.6%

663,787,225

104.9%

1,348,193,434

106.8%

8,775,000

155.8%

270,343,655

116.8%

529,114,051

91.9%

32,861,248

130.8%

N/4/4

1,900,000

70.6%

861,440,026

96.0%

549,878,289

100.2%

1,117,119,304

106.1%

8,832,928

95.4%

186,021,492

107.1%

1,000,440,984

97.2%

17,275,000

61.2%

S/1/1

77,649,707

94.9%

220,477,191

117.0%

92,140,018

95.6%

325,974,726

108.2%

37,348,584

91.1%

43,126,718

114.7%

3,324,217,468

97.7%

244,090,664

113.2%

S/2/1

35,606,411

76.5%

312,687,204

100.7%

190,844,063

101.1%

450,793,441

99.4%

2,925,000

103.9%

50,493,377

107.2%

784,528,034

101.3%

82,644,336

95.5%

S/2/2

19,448,975

76.3%

251,942,737

95.5%

122,991,130

92.6%

294,652,404

102.4%

1,050,000

59.2%

28,434,706

103.2%

932,515,030

101.8%

53,898,747

108.9%

U/1/1

4,089,643

79.6%

3,506,517

78.9%

1,380,000

19.3%

8,137,200

103.2%

962,499

27.8%

1,306,000

73.8%

459,244,973

101.9%

19,412,147

111.7%

uw x iy_band1

        iy_band1: 1900-1989       iy_band1: 1990-1999       iy_band1: 2000-2009       iy_band1: 2010+
uw      Outcome          Ratio    Outcome          Ratio    Outcome          Ratio    Outcome          Ratio
N/1/1   982,160,538      101.0%   12,534,732,161   98.5%    4,842,664,276    101.9%   844,067,034      111.3%
N/2/1   112,802,132      111.6%   4,655,974,027    105.8%   4,443,736,468    95.7%    449,255,426      86.6%
N/2/2   107,232,938      97.9%    4,512,424,274    101.4%   6,312,557,239    98.2%    614,316,695      110.2%
N/3/1   282,000          25.0%    1,044,203,844    94.0%    3,887,317,728    101.0%   1,018,568,155    102.9%
N/3/2   250,000          15.7%    1,008,204,301    108.1%   6,934,988,157    99.2%    867,526,097      98.0%
N/3/3   4,055,561        29.7%    755,727,777      89.7%    12,308,187,396   100.0%   1,902,379,323    105.0%
N/4/1   0                0.0%     394,664,512      92.8%    4,606,160,597    101.2%   1,842,134,012    98.8%
N/4/2                             308,834,925      92.1%    3,702,484,884    102.2%   1,151,191,418    95.6%
N/4/3                             158,854,990      106.6%   2,386,096,189    101.6%   1,020,273,713    95.6%
N/4/4   0                0.0%     146,243,472      85.1%    2,314,552,528    101.3%   1,282,112,023    99.7%
S/1/1   212,607,764      96.3%    2,836,569,692    97.3%    1,124,908,213    106.3%   190,939,407      111.3%
S/2/1   7,889,930        105.5%   479,056,419      117.8%   1,077,182,521    95.0%    346,392,996      95.7%
S/2/2   7,665,540        118.1%   388,469,845      110.5%   986,492,416      100.9%   322,305,928      87.3%
U/1/1   29,233,111       102.7%   320,187,107      96.6%    117,964,696      113.5%   30,654,065       89.3%

face_amount_band x dur_band1

face_amount_band

dur_band1: 01

dur_band1: 02

dur_band1: 03

dur_band1: 04-05

dur_band1: 06-15

dur_band1: 16-25

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

01 - 0 - 49,999

37,859,834

206.5%

46,746,236

170.9%

59,349,580

153.9%

129,333,013

132.3%

856,334,359

114.3%

3,064,186,050

93.9%

04 - 50,000 - 99,999

52,511,061

154.8%

66,787,618

135.5%

90,269,149

137.0%

206,424,919

123.5%

1,469,057,939

106.5%

3,397,973,888

94.7%

05 - 100,000 - 249,999

191,101,070

116.3%

269,652,985

110.7%

371,980,209

113.2%

855,544,316

103.2%

7,074,965,104

102.1%

7,477,763,920

96.6%

06 - 250,000 - 499,999

209,469,031

101.1%

304,453,835

99.9%

416,690,820

102.2%

1,000,438,105

98.2%

8,109,440,540

102.1%

4,705,111,802

96.7%

07 - 500,000 - 999,999

215,479,316

89.6%

350,555,118

99.6%

461,397,659

96.9%

1,167,345,786

98.3%

8,791,983,362

100.6%

4,193,982,866

100.3%

08 - 1,000,000+

562,715,983

93.0%

814,571,289

93.0%

1,094,470,475

93.0%

2,962,359,193

98.1%

29,375,708,984

98.1%

7,481,721,046

112.0%

face_amount_band x ia_band1

ia_band1

face_amount_band: 01 - 0 - 49,999

face_amount_band: 04 - 50,000 - 99,999

face_amount_band: 05 - 100,000 - 249,999

face_amount_band: 06 - 250,000 - 499,999

face_amount_band: 07 - 500,000 - 999,999

face_amount_band: 08 - 1,000,000+

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

18-24

75,552,784

113.3%

133,231,819

104.9%

220,299,419

96.4%

115,618,404

90.3%

61,754,428

95.5%

61,474,227

115.8%

25-34

192,799,300

135.4%

458,412,108

108.4%

1,476,938,558

99.2%

1,398,298,659

97.1%

1,378,321,244

99.1%

1,362,757,946

98.6%

35-44

421,318,794

111.8%

898,534,810

103.5%

3,004,123,357

98.4%

3,258,823,688

98.7%

3,627,163,035

96.5%

5,780,489,453

102.6%

45-54

861,129,196

105.6%

1,234,583,452

103.1%

4,027,597,384

100.6%

3,776,175,782

98.0%

3,750,396,726

96.9%

6,979,627,775

101.3%

55-64

1,371,789,823

97.4%

1,302,388,605

98.1%

4,041,501,912

100.2%

3,254,795,121

100.4%

2,876,205,825

98.5%

6,600,470,804

101.2%

65-74

1,035,580,594

89.8%

950,356,588

92.1%

2,489,528,859

99.8%

1,929,056,738

103.8%

2,107,668,820

107.8%

8,096,645,917

99.8%

75-84

221,626,122

101.6%

285,327,929

97.6%

901,837,477

103.1%

915,794,848

108.8%

1,257,117,333

112.8%

12,302,308,241

98.1%

85-99

14,012,459

119.7%

20,189,263

120.5%

79,180,638

118.3%

97,040,893

115.0%

122,116,696

115.6%

1,107,772,607

95.9%

face_amount_band x gender

                         gender: F                 gender: M
face_amount_band         Outcome          Ratio    Outcome          Ratio
01 - 0 - 49,999          2,082,160,405    95.0%    2,111,648,667    105.4%
04 - 50,000 - 99,999     2,075,208,647    94.5%    3,207,815,927    103.9%
05 - 100,000 - 249,999   5,592,847,725    95.7%    10,648,159,879   102.4%
06 - 250,000 - 499,999   4,715,664,348    99.9%    10,029,939,785   100.1%
07 - 500,000 - 999,999   4,435,784,376    102.6%   10,744,959,731   99.0%
08 - 1,000,000+          14,355,526,434   102.7%   27,936,020,536   98.7%

face_amount_band x insurance_plan

                         insurance_plan: Other     insurance_plan: Perm      insurance_plan: Term      insurance_plan: xL
face_amount_band         Outcome          Ratio    Outcome          Ratio    Outcome          Ratio    Outcome          Ratio
01 - 0 - 49,999          4,987,841        93.2%    2,341,324,220    98.6%    430,107,626      122.3%   1,417,389,385    97.0%
04 - 50,000 - 99,999     5,666,783        99.9%    1,269,348,289    93.3%    1,013,673,047    124.2%   2,994,336,455    96.6%
05 - 100,000 - 249,999   34,041,514       117.9%   2,315,213,211    95.1%    6,874,173,321    102.8%   7,017,579,558    99.0%
06 - 250,000 - 499,999   28,639,038       138.2%   1,510,214,779    94.3%    8,348,184,153    99.6%    4,858,566,163    102.4%
07 - 500,000 - 999,999   18,132,026       99.2%    1,434,005,349    91.2%    8,575,363,506    97.6%    5,153,243,226    107.3%
08 - 1,000,000+          95,885,098       88.4%    3,291,817,413    116.8%   13,782,114,400   98.4%    25,121,730,059   99.0%

face_amount_band x ltp

ltp

face_amount_band: 01 - 0 - 49,999

face_amount_band: 04 - 50,000 - 99,999

face_amount_band: 05 - 100,000 - 249,999

face_amount_band: 06 - 250,000 - 499,999

face_amount_band: 07 - 500,000 - 999,999

face_amount_band: 08 - 1,000,000+

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

5 yr

40,106,609

99.0%

108,758,076

109.8%

223,852,354

96.5%

134,040,438

98.9%

88,356,037

106.2%

91,230,000

95.0%

10 yr

79,471,212

114.9%

152,307,148

128.4%

1,212,386,424

106.6%

1,312,587,882

102.0%

1,417,042,517

102.6%

3,097,869,540

94.5%

15 yr

63,154,393

123.8%

171,454,282

118.5%

1,115,915,003

101.2%

1,284,208,527

99.8%

1,254,669,304

98.2%

2,128,355,051

98.8%

20 yr

113,505,987

126.9%

278,179,336

125.6%

2,893,988,498

100.6%

3,894,785,171

99.7%

4,168,688,693

97.7%

6,417,900,922

100.1%

25 yr

9,580,391

167.3%

26,792,196

146.7%

154,206,000

105.4%

162,940,708

96.8%

101,887,627

87.9%

69,816,000

98.8%

30 yr

6,329,793

185.7%

31,267,129

163.5%

436,286,383

109.2%

721,667,111

98.6%

809,282,434

96.0%

1,087,898,935

99.3%

Not Level Term

3,802,855,700

98.1%

4,317,239,186

95.6%

9,477,900,932

98.0%

6,615,106,546

100.5%

6,758,480,601

103.0%

28,642,732,570

100.8%

Unknown

78,804,987

138.5%

197,027,221

133.2%

726,472,010

107.1%

620,267,750

95.4%

582,336,894

89.1%

755,743,952

97.7%

face_amount_band x iy_band1

                         iy_band1: 1900-1989       iy_band1: 1990-1999       iy_band1: 2000-2009       iy_band1: 2010+
face_amount_band         Outcome          Ratio    Outcome          Ratio    Outcome          Ratio    Outcome          Ratio
01 - 0 - 49,999          267,683,109      88.3%    2,808,814,596    94.6%    833,122,239      113.9%   284,189,128      150.3%
04 - 50,000 - 99,999     267,216,839      93.3%    3,149,056,719    94.8%    1,444,089,389    106.9%   422,661,627      130.8%
05 - 100,000 - 249,999   404,345,538      94.5%    7,205,856,453    96.5%    6,896,526,088    102.1%   1,734,279,525    109.1%
06 - 250,000 - 499,999   158,023,865      94.6%    4,708,333,873    97.2%    7,919,827,744    101.7%   1,959,418,651    100.8%
07 - 500,000 - 999,999   136,967,257      113.5%   4,219,050,921    100.2%   8,611,625,509    100.4%   2,213,100,420    97.6%
08 - 1,000,000+          229,942,906      144.9%   7,453,034,784    110.8%   29,340,102,339   98.3%    5,268,466,941    94.6%

dur_band1 x ia_band1

ia_band1

dur_band1: 01

dur_band1: 02

dur_band1: 03

dur_band1: 04-05

dur_band1: 06-15

dur_band1: 16-25

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

18-24

38,014,843

88.5%

30,257,763

81.9%

27,602,930

89.9%

48,778,652

96.0%

247,658,415

121.0%

275,618,478

91.3%

25-34

134,511,109

103.4%

163,517,330

108.7%

194,704,716

111.7%

424,147,771

112.2%

2,807,372,903

100.3%

2,543,273,986

96.5%

35-44

290,877,481

135.5%

372,345,417

117.1%

502,388,214

117.5%

1,120,149,604

104.8%

8,625,261,989

101.2%

6,079,430,432

94.4%

45-54

284,754,759

87.7%

443,414,694

93.4%

668,187,442

103.0%

1,648,875,363

101.9%

10,888,781,617

102.0%

6,695,496,440

97.2%

55-64

316,638,372

94.6%

537,383,357

107.2%

610,988,925

88.1%

1,580,728,594

93.5%

9,237,741,705

99.1%

7,163,671,137

103.8%

65-74

143,303,417

80.8%

240,017,175

81.0%

343,292,965

88.5%

930,432,092

92.5%

8,689,000,890

98.9%

6,262,790,977

105.1%

75-84

45,633,095

118.8%

49,322,782

76.7%

121,205,864

107.0%

506,866,160

116.0%

13,883,785,714

98.6%

1,277,198,335

110.7%

85-99

15,403,219

239.9%

16,508,563

154.3%

25,786,836

139.8%

61,467,096

84.7%

1,297,887,055

100.3%

23,259,787

61.5%

dur_band1 x gender

            gender: F                 gender: M
dur_band1   Outcome          Ratio    Outcome          Ratio
01          322,942,506      112.1%   946,193,789      96.4%
02          424,637,814      88.4%    1,428,129,267    104.0%
03          697,046,249      100.6%   1,797,111,643    99.8%
04-05       1,788,538,549    94.6%    4,532,906,783    102.3%
06-15       19,652,776,350   100.3%   36,024,713,938   99.8%
16-25       10,371,250,467   100.6%   19,949,489,105   99.7%

dur_band1 x insurance_plan

            insurance_plan: Other     insurance_plan: Perm      insurance_plan: Term      insurance_plan: xL
dur_band1   Outcome          Ratio    Outcome          Ratio    Outcome          Ratio    Outcome          Ratio
01          6,987,296        71.5%    115,387,649      144.0%   816,300,396      95.0%    330,460,954      103.3%
02          12,956,089       93.2%    186,759,670      170.6%   1,193,573,840    98.7%    459,477,482      88.3%
03          10,940,700       62.7%    173,260,241      123.2%   1,539,568,218    97.1%    770,388,733      102.6%
04-05       41,382,619       117.3%   351,202,533      111.1%   3,960,972,351    101.1%   1,967,887,829    95.9%
06-15       92,001,723       94.3%    2,251,458,889    110.9%   24,546,892,971   100.1%   28,787,136,705   99.2%
16-25       23,083,873       172.0%   9,083,854,279    95.8%    6,966,308,277    100.5%   14,247,493,143   102.6%

dur_band1 x ltp

ltp

dur_band1: 01

dur_band1: 02

dur_band1: 03

dur_band1: 04-05

dur_band1: 06-15

dur_band1: 16-25

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

5 yr

8,474,532

140.6%

5,768,265

70.3%

13,120,000

97.0%

47,280,625

102.7%

258,887,272

99.8%

352,812,820

99.9%

10 yr

250,039,805

87.6%

374,445,553

90.6%

463,240,176

85.9%

1,216,385,768

92.0%

4,556,704,077

103.6%

410,849,344

131.3%

15 yr

106,613,913

95.8%

154,795,986

92.5%

219,264,610

98.1%

520,402,972

92.2%

4,466,179,563

94.8%

550,499,516

229.5%

20 yr

314,198,709

105.4%

447,558,227

106.0%

600,176,032

106.3%

1,543,338,839

108.2%

11,599,090,681

100.9%

3,262,686,119

91.5%

25 yr

3,615,999

48.6%

13,707,800

138.9%

18,921,628

130.4%

49,784,528

143.2%

267,598,048

98.6%

171,594,919

91.7%

30 yr

66,053,409

84.9%

98,748,126

97.2%

130,459,928

98.9%

388,766,012

117.8%

1,983,373,230

102.1%

425,331,080

83.7%

Not Level Term

453,235,980

110.6%

659,812,994

102.4%

954,906,252

105.0%

2,361,669,130

98.2%

31,616,749,475

100.0%

23,567,941,704

99.8%

Unknown

66,903,948

91.4%

97,930,130

113.4%

94,069,266

96.1%

193,817,458

100.1%

928,907,942

95.6%

1,579,024,070

102.7%

dur_band1 x iy_band1

            iy_band1: 2010+           iy_band1: 2000-2009       iy_band1: 1990-1999       iy_band1: 1900-1989
dur_band1   Outcome          Ratio    Outcome          Ratio    Outcome          Ratio    Outcome          Ratio
01          1,269,136,295    100.0%
02          1,852,767,081    100.0%
03          2,494,157,892    100.0%
04-05       4,479,996,408    98.5%    1,841,448,924    103.8%
06-15       1,786,058,616    103.9%   51,370,835,971   99.9%    2,520,595,701    98.9%
16-25                                 1,833,008,413    98.5%    27,023,551,645   100.1%   1,464,179,514    100.0%

ia_band1 x gender

           gender: F                 gender: M
ia_band1   Outcome          Ratio    Outcome          Ratio
18-24      210,351,040      88.4%    457,580,041      106.4%
25-34      1,919,168,906    93.6%    4,348,358,909    103.1%
35-44      4,376,241,525    95.6%    12,614,211,612   101.6%
45-54      4,611,772,525    96.3%    16,017,737,790   101.1%
55-64      4,809,017,158    99.7%    14,638,134,932   100.1%
65-74      7,381,394,530    102.0%   9,227,442,986    98.5%
75-84      9,045,279,898    104.3%   6,838,732,052    94.9%
85-99      903,966,353      104.7%   536,346,203      93.0%

ia_band1 x insurance_plan

           insurance_plan: Other     insurance_plan: Perm      insurance_plan: Term      insurance_plan: xL
ia_band1   Outcome          Ratio    Outcome          Ratio    Outcome          Ratio    Outcome          Ratio
18-24      708,665          58.0%    166,230,095      92.6%    267,466,700      103.9%   233,525,621      101.6%
25-34      3,256,794        65.2%    852,327,846      101.0%   4,057,228,956    99.0%    1,354,714,219    102.7%
35-44      21,307,402       115.2%   1,846,365,837    95.1%    11,715,945,894   100.6%   3,406,834,004    100.6%
45-54      57,525,597       130.3%   2,565,363,967    96.2%    12,641,031,673   100.0%   5,365,589,078    101.7%
55-64      47,144,955       78.8%    3,306,501,193    98.1%    8,357,779,097    99.2%    7,735,726,845    101.9%
65-74      35,749,661       100.5%   2,847,699,847    105.8%   1,896,982,781    100.8%   11,828,405,227   98.6%
75-84      21,392,494       100.4%   492,087,214      117.0%   86,369,648       119.9%   15,284,162,594   99.4%
85-99      266,732          14.9%    85,347,262       178.0%   811,304          157.5%   1,353,887,258    97.4%

ia_band1 x ltp

ia_band1

ltp: 5 yr

ltp: 10 yr

ltp: 15 yr

ltp: 20 yr

ltp: 25 yr

ltp: 30 yr

ltp: Not Level Term

ltp: Unknown

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

18-24

17,726,831

84.0%

32,009,383

104.5%

3,343,000

50.1%

80,676,936

114.1%

13,002,500

99.2%

48,824,425

108.2%

403,156,010

97.8%

69,191,996

101.5%

25-34

147,704,951

101.6%

317,080,031

112.2%

150,181,284

111.1%

1,670,129,784

100.3%

164,417,165

95.1%

748,387,532

96.9%

2,274,993,951

101.8%

794,633,117

92.6%

35-44

232,956,798

94.0%

1,049,894,842

107.7%

983,755,234

111.0%

6,112,699,729

99.2%

222,687,940

94.4%

1,620,869,320

100.6%

5,508,205,763

99.0%

1,259,383,511

96.3%

45-54

178,555,603

105.5%

2,312,649,990

105.3%

2,156,659,786

94.8%

6,421,617,312

98.7%

95,336,910

117.1%

652,477,911

101.5%

8,212,889,839

99.9%

599,322,964

111.2%

55-64

76,414,209

100.5%

2,489,338,251

93.6%

2,056,964,977

97.3%

3,341,836,281

103.9%

27,176,407

137.6%

22,072,597

104.0%

11,252,197,601

100.5%

181,151,767

121.3%

65-74

27,015,122

109.9%

1,014,083,731

94.7%

662,777,279

111.1%

140,088,565

96.5%

2,602,000

131.2%

100,000

44.5%

14,725,682,019

99.9%

36,488,800

116.5%

75-84

5,970,000

287.3%

56,608,495

96.8%

4,065,000

99.0%

0

0.0%

15,797,689,100

99.9%

19,679,355

277.5%

85-99

0

0.0%

10,000

7067.2%

1,439,501,252

100.0%

801,304

159.5%

ia_band1 x iy_band1

| ia_band1 | iy_band1: 1900-1989 Outcome | iy_band1: 1900-1989 Ratio | iy_band1: 1990-1999 Outcome | iy_band1: 1990-1999 Ratio | iy_band1: 2000-2009 Outcome | iy_band1: 2000-2009 Ratio | iy_band1: 2010+ Outcome | iy_band1: 2010+ Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 18-24 | 21,957,490 | 91.9% | 248,374,326 | 89.6% | 249,952,870 | 120.2% | 147,646,395 | 93.0% |
| 25-34 | 156,424,882 | 93.9% | 2,415,663,468 | 97.2% | 2,774,706,986 | 99.6% | 920,732,479 | 110.8% |
| 35-44 | 323,624,196 | 99.9% | 5,875,528,027 | 94.0% | 8,554,094,758 | 101.8% | 2,237,206,156 | 111.3% |
| 45-54 | 340,733,447 | 95.8% | 6,605,000,254 | 97.1% | 10,642,985,328 | 102.1% | 3,040,791,286 | 99.7% |
| 55-64 | 443,897,049 | 104.7% | 6,912,220,124 | 103.2% | 8,972,404,806 | 98.8% | 3,118,630,111 | 96.0% |
| 65-74 | 172,472,214 | 104.5% | 6,127,207,226 | 105.3% | 8,704,756,697 | 99.4% | 1,604,401,379 | 85.8% |
| 75-84 | 5,070,236 | 101.8% | 1,325,459,599 | 112.9% | 13,867,379,988 | 98.4% | 686,102,127 | 112.7% |
| 85-99 | 0 | 0.0% | 34,694,322 | 83.1% | 1,279,011,875 | 99.0% | 126,606,359 | 119.4% |

gender x insurance_plan

| insurance_plan | gender: F Outcome | gender: F Ratio | gender: M Outcome | gender: M Ratio |
| --- | --- | --- | --- | --- |
| Other | 76,199,638 | 107.2% | 111,152,662 | 95.6% |
| Perm | 4,259,680,761 | 99.0% | 7,902,242,500 | 100.6% |
| Term | 8,746,195,310 | 96.4% | 30,277,420,743 | 101.1% |
| xL | 20,175,116,226 | 101.8% | 26,387,728,620 | 98.6% |

gender x ltp

| ltp | gender: F Outcome | gender: F Ratio | gender: M Outcome | gender: M Ratio |
| --- | --- | --- | --- | --- |
| 5 yr | 233,647,378 | 101.9% | 452,696,136 | 99.0% |
| 10 yr | 1,294,444,561 | 102.3% | 5,977,220,162 | 99.5% |
| 15 yr | 1,152,920,343 | 94.6% | 4,864,836,217 | 101.4% |
| 20 yr | 4,067,960,544 | 95.2% | 13,699,088,063 | 101.5% |
| 25 yr | 164,969,732 | 96.4% | 360,253,190 | 101.7% |
| 30 yr | 911,017,939 | 93.9% | 2,181,713,846 | 102.8% |
| Not Level Term | 24,668,222,437 | 101.3% | 34,946,093,098 | 99.1% |
| Unknown | 764,009,001 | 96.9% | 2,196,643,813 | 101.1% |

gender x iy_band1

| iy_band1 | gender: F Outcome | gender: F Ratio | gender: M Outcome | gender: M Ratio |
| --- | --- | --- | --- | --- |
| 1900-1989 | 415,833,187 | 97.0% | 1,048,346,327 | 101.2% |
| 1990-1999 | 10,124,056,124 | 101.1% | 19,420,091,222 | 99.4% |
| 2000-2009 | 19,505,796,137 | 100.2% | 35,539,497,171 | 99.9% |
| 2010+ | 3,211,506,487 | 96.2% | 8,670,609,805 | 101.5% |

insurance_plan x ltp

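| ltp | insurance_plan: Term Outcome | insurance_plan: Term Ratio | insurance_plan: Other Outcome | insurance_plan: Other Ratio | insurance_plan: Perm Outcome | insurance_plan: Perm Ratio | insurance_plan: xL Outcome | insurance_plan: xL Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 5 yr | 686,343,514 | 100.0% | | | | | | |
| 10 yr | 7,271,664,723 | 100.0% | | | | | | |
| 15 yr | 6,017,756,560 | 100.0% | | | | | | |
| 20 yr | 17,767,048,607 | 100.0% | | | | | | |
| 25 yr | 525,222,922 | 100.0% | | | | | | |
| 30 yr | 3,092,731,785 | 100.0% | | | | | | |
| Not Level Term | 702,195,128 | 100.0% | 187,352,300 | 100.0% | 12,161,923,261 | 100.0% | 46,562,844,846 | 100.0% |
| Unknown | 2,960,652,814 | 100.0% | | | | | | |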

insurance_plan x iy_band1

| insurance_plan | iy_band1: 1900-1989 Outcome | iy_band1: 1900-1989 Ratio | iy_band1: 1990-1999 Outcome | iy_band1: 1990-1999 Ratio | iy_band1: 2000-2009 Outcome | iy_band1: 2000-2009 Ratio | iy_band1: 2010+ Outcome | iy_band1: 2010+ Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Other | 699,086 | 17.9% | 24,035,150 | 230.6% | 95,671,569 | 96.9% | 66,946,495 | 90.2% |
| Perm | 685,297,227 | 91.9% | 8,414,471,562 | 96.2% | 2,230,633,550 | 111.2% | 831,520,922 | 125.3% |
| Term | 110,349,436 | 110.0% | 7,507,786,831 | 99.9% | 24,023,250,728 | 100.4% | 7,382,229,058 | 98.8% |
| xL | 667,833,765 | 108.7% | 13,597,853,803 | 102.5% | 28,695,737,461 | 98.9% | 3,601,419,817 | 98.1% |

ltp x iy_band1

| ltp | iy_band1: 1900-1989 Outcome | iy_band1: 1900-1989 Ratio | iy_band1: 1990-1999 Outcome | iy_band1: 1990-1999 Ratio | iy_band1: 2000-2009 Outcome | iy_band1: 2000-2009 Ratio | iy_band1: 2010+ Outcome | iy_band1: 2010+ Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 5 yr | 11,828,904 | 87.5% | 356,634,986 | 100.8% | 256,413,452 | 100.6% | 61,466,172 | 96.0% |
| 10 yr | 7,313,258 | 124.1% | 395,337,236 | 127.5% | 4,556,557,084 | 103.2% | 2,312,457,145 | 91.1% |
| 15 yr | 770,000 | 85.8% | 786,184,143 | 141.4% | 4,222,555,938 | 95.8% | 1,008,246,479 | 95.9% |
| 20 yr | 27,216,551 | 136.9% | 3,469,245,630 | 93.0% | 11,456,767,739 | 101.0% | 2,813,818,687 | 105.4% |
| 25 yr | 0 | 0.0% | 162,532,838 | 88.4% | 278,353,329 | 102.4% | 84,336,755 | 121.4% |
| 30 yr | 0 | 0.0% | 553,191,034 | 85.1% | 1,883,712,240 | 103.7% | 655,828,511 | 104.6% |
| Not Level Term | 1,355,860,609 | 99.2% | 22,279,426,292 | 100.0% | 31,476,742,520 | 99.8% | 4,502,286,114 | 102.1% |
| Unknown | 61,190,192 | 106.3% | 1,541,595,187 | 104.5% | 914,191,006 | 93.7% | 443,676,429 | 98.3% |

Subgroup Variability

This section reproduces Brian Holland’s publication. For background on the tables generated below, please refer to the publication.

## Load custom functions
source("R/functions_BDHGLM.R")

## Generate and format summary tables for each factor column
factor_cols %>%
  map(.f = \(x) {
    ## Call the mainF function to compute weighted averages for GLM factors
    mainF(
      df = ds,
      model = modelGLM,
      rf = x,
      resp = resp_var,
      offset = resp_offset
    ) %>%
      ## Create a flextable from the results
      flextable() %>%
      ## Set header labels for the table
      set_header_labels("rowname" = "") %>%
      ## Format table values as percentages if numeric
      set_formatter(values = function(x) {
        if (is.numeric(x))
          sprintf("%.1f%%", x * 100)
        else
          x
      }) %>%
      ## Set a caption for the table
      set_caption(caption = paste0("Weighted Average GLM Factors for Variable: ", x)) %>%
      ## Set table properties to enable scrolling
      set_table_properties(opts_html = list(
        scroll = list(
          add_css = "max-height: 500px;"
        )
      )) %>%
      autofit() #%>%
      ## Print the flextable
      #knitr::knit_print()
  }) %>%
  ## Set names for each table based on the factor columns
  purrr::set_names(factor_cols) ->
  output_tables 

if(output_format == "html") {
  output_tables %>%
    map(.f=knitr::knit_print) %>%
    ## Generate a tabset from the list of tables
    generate_tabset(
      tabtitle = "Tables of Terms",
      tablevel = 4
    ) %>%
    ## Print the generated tabset
    cat()
} else {
  export_tables_to_excel(
    output_tables,
    paste0(
        exportsRoot,
        "_glm_subgroup_variability.xlsx"
      )
  )
  cat("See included Excel table for additional information.\n")
}
uw
Weighted Average GLM Factors for Variable: uw

| | N/1/1 | N/2/1 | N/2/2 | N/3/1 | N/3/2 | N/3/3 | N/4/1 | N/4/2 | N/4/3 | N/4/4 | S/1/1 | S/2/1 | S/2/2 | U/1/1 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| amount_2015vbt | 97.2% | 78.0% | 116.0% | 65.3% | 72.0% | 98.7% | 64.1% | 76.0% | 92.4% | 110.2% | 111.6% | 79.6% | 108.8% | 139.2% |
| Factor: uw | 100.0% | 85.9% | 128.2% | 72.4% | 83.6% | 116.4% | 71.9% | 84.7% | 101.1% | 123.1% | 107.0% | 83.6% | 111.9% | 129.8% |
| Ave Fac: dur_band1 | 90.7% | 90.4% | 90.2% | 90.7% | 90.1% | 90.1% | 91.4% | 91.1% | 91.6% | 91.9% | 90.7% | 91.1% | 91.3% | 91.2% |
| Ave Fac: face_amount_band | 78.0% | 73.3% | 74.8% | 72.5% | 72.1% | 72.5% | 72.3% | 72.6% | 72.6% | 73.0% | 81.4% | 75.0% | 76.1% | 86.4% |
| Ave Fac: gender | 100.7% | 100.7% | 100.6% | 100.7% | 100.7% | 100.7% | 100.8% | 100.8% | 100.9% | 100.8% | 100.7% | 100.8% | 100.7% | 100.6% |
| Ave Fac: ia_band1 | 91.6% | 91.3% | 89.8% | 92.2% | 89.5% | 88.1% | 93.1% | 92.1% | 92.1% | 91.1% | 92.9% | 92.9% | 92.3% | 92.4% |
| Ave Fac: insurance_plan | 88.8% | 89.9% | 90.3% | 83.7% | 89.2% | 90.9% | 78.0% | 79.4% | 79.5% | 82.1% | 88.1% | 83.5% | 85.9% | 89.0% |
| Ave Fac: iy_band1 | 90.6% | 88.8% | 88.4% | 86.5% | 86.3% | 85.9% | 85.2% | 85.4% | 84.9% | 84.7% | 90.8% | 86.5% | 86.4% | 90.6% |
| Ave Fac: ltp | 71.1% | 71.1% | 71.1% | 76.5% | 72.8% | 71.4% | 83.3% | 82.4% | 82.8% | 80.6% | 71.6% | 78.3% | 76.0% | 69.5% |

face_amount_band
Weighted Average GLM Factors for Variable: face_amount_band

| | 01 - 0 - 49,999 | 04 - 50,000 - 99,999 | 05 - 100,000 - 249,999 | 06 - 250,000 - 499,999 | 07 - 500,000 - 999,999 | 08 - 1,000,000+ |
| --- | --- | --- | --- | --- | --- | --- |
| amount_2015vbt | 132.1% | 118.6% | 98.9% | 88.2% | 83.2% | 80.4% |
| Factor: face_amount_band | 100.0% | 89.3% | 79.0% | 74.2% | 72.5% | 70.6% |
| Ave Fac: dur_band1 | 90.8% | 90.8% | 90.8% | 90.8% | 90.8% | 90.5% |
| Ave Fac: gender | 100.5% | 100.6% | 100.7% | 100.7% | 100.8% | 100.7% |
| Ave Fac: ia_band1 | 91.6% | 92.1% | 92.2% | 92.4% | 92.4% | 89.6% |
| Ave Fac: insurance_plan | 89.5% | 90.6% | 86.2% | 83.3% | 83.0% | 88.5% |
| Ave Fac: iy_band1 | 91.4% | 90.5% | 88.8% | 87.6% | 87.1% | 86.5% |
| Ave Fac: ltp | 68.6% | 70.0% | 74.7% | 77.2% | 77.5% | 73.4% |
| Ave Fac: uw | 105.6% | 104.0% | 99.3% | 95.4% | 93.2% | 95.5% |

dur_band1
Weighted Average GLM Factors for Variable: dur_band1

| | 01 | 02 | 03 | 04-05 | 06-15 | 16-25 |
| --- | --- | --- | --- | --- | --- | --- |
| amount_2015vbt | 90.4% | 90.8% | 87.3% | 85.3% | 82.4% | 98.9% |
| Factor: dur_band1 | 100.0% | 101.0% | 100.0% | 95.7% | 89.1% | 90.9% |
| Ave Fac: face_amount_band | 73.1% | 73.1% | 73.1% | 73.1% | 72.8% | 77.4% |
| Ave Fac: gender | 100.8% | 100.8% | 100.8% | 100.7% | 100.7% | 100.7% |
| Ave Fac: ia_band1 | 92.6% | 92.4% | 92.2% | 91.9% | 90.1% | 92.5% |
| Ave Fac: insurance_plan | 81.9% | 82.4% | 82.7% | 83.1% | 86.7% | 88.0% |
| Ave Fac: iy_band1 | 81.4% | 81.4% | 81.4% | 82.4% | 86.2% | 92.3% |
| Ave Fac: ltp | 79.9% | 79.6% | 79.3% | 79.1% | 75.1% | 71.5% |
| Ave Fac: uw | 93.4% | 92.8% | 92.4% | 92.4% | 95.7% | 99.1% |

ia_band1
Weighted Average GLM Factors for Variable: ia_band1

| | 18-24 | 25-34 | 35-44 | 45-54 | 55-64 | 65-74 | 75-84 | 85-99 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| amount_2015vbt | 107.9% | 88.6% | 86.7% | 86.9% | 88.9% | 92.8% | 82.7% | 76.7% |
| Factor: ia_band1 | 100.0% | 94.0% | 95.6% | 92.8% | 90.4% | 92.2% | 84.1% | 73.8% |
| Ave Fac: dur_band1 | 92.4% | 91.0% | 90.8% | 90.9% | 91.1% | 90.6% | 89.5% | 89.7% |
| Ave Fac: face_amount_band | 79.2% | 74.8% | 73.9% | 74.5% | 75.4% | 74.6% | 71.8% | 71.7% |
| Ave Fac: gender | 100.7% | 100.7% | 100.8% | 100.8% | 100.8% | 100.6% | 100.5% | 100.4% |
| Ave Fac: insurance_plan | 85.4% | 80.9% | 80.5% | 81.9% | 85.3% | 92.5% | 95.6% | 95.6% |
| Ave Fac: iy_band1 | 87.5% | 87.9% | 87.8% | 87.4% | 87.4% | 87.7% | 86.4% | 86.0% |
| Ave Fac: ltp | 74.3% | 78.3% | 78.9% | 78.6% | 76.2% | 69.9% | 67.4% | 67.3% |
| Ave Fac: uw | 97.1% | 92.9% | 91.6% | 93.0% | 95.4% | 99.2% | 103.1% | 108.5% |

gender
Weighted Average GLM Factors for Variable: gender

| | F | M |
| --- | --- | --- |
| amount_2015vbt | 87.4% | 87.6% |
| Factor: gender | 100.0% | 101.1% |
| Ave Fac: dur_band1 | 90.5% | 90.8% |
| Ave Fac: face_amount_band | 74.8% | 73.8% |
| Ave Fac: ia_band1 | 89.8% | 91.6% |
| Ave Fac: insurance_plan | 89.3% | 85.2% |
| Ave Fac: iy_band1 | 87.4% | 87.3% |
| Ave Fac: ltp | 72.2% | 75.8% |
| Ave Fac: uw | 97.5% | 95.6% |

insurance_plan
Weighted Average GLM Factors for Variable: insurance_plan

| | Other | Perm | Term | xL |
| --- | --- | --- | --- | --- |
| amount_2015vbt | 84.1% | 96.4% | 83.4% | 89.2% |
| Factor: insurance_plan | 100.0% | 87.5% | 76.0% | 95.9% |
| Ave Fac: dur_band1 | 93.1% | 91.0% | 91.0% | 90.2% |
| Ave Fac: face_amount_band | 73.0% | 79.1% | 73.4% | 73.7% |
| Ave Fac: gender | 100.7% | 100.7% | 100.8% | 100.6% |
| Ave Fac: ia_band1 | 91.2% | 92.2% | 93.3% | 88.7% |
| Ave Fac: iy_band1 | 84.5% | 91.0% | 86.4% | 87.4% |
| Ave Fac: ltp | 67.3% | 67.3% | 84.5% | 67.3% |
| Ave Fac: uw | 97.9% | 101.3% | 90.5% | 100.3% |

ltp
Weighted Average GLM Factors for Variable: ltp

| | 5 yr | 10 yr | 15 yr | 20 yr | 25 yr | 30 yr | Not Level Term | Unknown |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| amount_2015vbt | 117.1% | 91.7% | 86.2% | 78.6% | 91.8% | 81.0% | 90.2% | 92.4% |
| Factor: ltp | 100.0% | 91.4% | 90.8% | 81.3% | 78.1% | 85.1% | 67.3% | 81.3% |
| Ave Fac: dur_band1 | 90.8% | 92.2% | 90.7% | 90.7% | 91.0% | 91.1% | 90.4% | 91.5% |
| Ave Fac: face_amount_band | 77.5% | 73.1% | 73.5% | 73.2% | 74.8% | 72.9% | 74.7% | 74.6% |
| Ave Fac: gender | 100.7% | 100.9% | 100.9% | 100.8% | 100.7% | 100.7% | 100.6% | 100.8% |
| Ave Fac: ia_band1 | 93.9% | 92.2% | 92.3% | 93.5% | 94.4% | 94.6% | 89.4% | 94.4% |
| Ave Fac: insurance_plan | 76.0% | 76.0% | 76.0% | 76.0% | 76.0% | 76.0% | 94.1% | 76.0% |
| Ave Fac: iy_band1 | 89.1% | 84.5% | 85.7% | 86.7% | 87.4% | 86.5% | 88.1% | 88.5% |
| Ave Fac: uw | 98.4% | 93.3% | 90.0% | 88.9% | 103.5% | 86.1% | 100.3% | 97.8% |

iy_band1
Weighted Average GLM Factors for Variable: iy_band1

| | 1900-1989 | 1990-1999 | 2000-2009 | 2010+ |
| --- | --- | --- | --- | --- |
| amount_2015vbt | 116.3% | 99.5% | 82.7% | 85.3% |
| Factor: iy_band1 | 100.0% | 93.0% | 86.2% | 81.4% |
| Ave Fac: dur_band1 | 90.9% | 90.7% | 89.4% | 95.8% |
| Ave Fac: face_amount_band | 82.0% | 77.4% | 72.9% | 73.2% |
| Ave Fac: gender | 100.8% | 100.7% | 100.7% | 100.8% |
| Ave Fac: ia_band1 | 92.8% | 92.5% | 90.1% | 92.1% |
| Ave Fac: insurance_plan | 90.2% | 87.8% | 86.9% | 83.0% |
| Ave Fac: ltp | 68.4% | 71.6% | 74.8% | 79.1% |
| Ave Fac: uw | 101.9% | 99.1% | 95.9% | 92.4% |

LightGBM

Data Preparation

First, the data are prepared for LightGBM, which expects matrices for its inputs. Factors are recast as their underlying integer indices. The LightGBM model is then trained on the resulting matrices.
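The recasting step can be sketched in base R as follows. This is a minimal illustration rather than the report's actual preparation code: the toy data frame, its values, and the names `toy`, `toy_x`, `toy_y`, and `toy_weight` are all hypothetical, and only the column names mirror those used in this report.

```r
## Hypothetical toy data; column names mirror the report, values are invented
toy <- data.frame(
  gender         = factor(c("F", "M", "M", "F")),
  insurance_plan = factor(c("Term", "Perm", "xL", "Term"),
                          levels = c("Other", "Perm", "Term", "xL")),
  amount_actual  = c(0, 100000, 0, 250000),
  amount_2015vbt = c(1200, 90000, 4000, 180000)
)

feature_cols <- c("gender", "insurance_plan")

## data.matrix() replaces each factor by its internal integer code,
## e.g. "Term" becomes 3 because it is the third level of insurance_plan
toy_x <- data.matrix(toy[, feature_cols])

## Response and weight vectors are kept alongside the feature matrix
toy_y      <- toy$amount_actual
toy_weight <- toy$amount_2015vbt
```

The integer codes depend on the order of the factor levels, so the same level order should be used when encoding the training and test subsets.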

Model Fitting

The LightGBM model is fit to the training subset using a Poisson objective. The model response is the ratio of the response variable to the response offset, and the weights are the specified offset. Often, the response is “actual claims” and the offset is “expected claims”.
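The choice of a ratio response with offset weights can be motivated with a small base R sketch. The claim and expected amounts below are invented for illustration, and the constant `predicted_ratio` is only a stand-in for model output. The sketch shows that weighting each actual-to-expected ratio by its expected amount recovers the aggregate ratio, and that multiplying a predicted ratio by the offset returns it to the claim scale.

```r
actual   <- c(0, 100000, 0, 250000)       # hypothetical claim amounts
expected <- c(1200, 90000, 4000, 180000)  # hypothetical tabular expected amounts

ratio      <- actual / expected            # label of the kind passed to the model
overall_ae <- sum(actual) / sum(expected)  # aggregate actual-to-expected ratio

## Weighting each ratio by its expected amount recovers the aggregate A/E
weighted_ae <- weighted.mean(ratio, w = expected)

## Converting a predicted ratio back to the claim scale
predicted_ratio  <- rep(weighted_ae, length(expected))  # stand-in for model output
predicted_amount <- predicted_ratio * expected
```

This is why the predictions in the fitting code are multiplied by the offset after calling `predict()`.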

#==============================================================================#
#### Section 5: lightgbm ####
#==============================================================================#

#### Notes
## Helpful Resources:
## https://lightgbm.readthedocs.io/en/v3.3.2/
## https://christophm.github.io/interpretable-ml-book/shapley.html

## In this section we fit a lightgbm model, an implementation of gradient 
## boosting machines. We also extract interaction feature importance
## This is a way to determine most likely interactions for a linear model
## We also look at Shapley values which can be useful to decompose black-box 
## model predictions

#-----------------------------------------#
##### Fit model and make predictions #####
#-----------------------------------------#

## create lgbm dataset
lgbm.train <- lgb.Dataset(train.x.lgbm, 
                   label  = train.y.lgbm/train.weight.lgbm, 
                   weight = train.weight.lgbm)

lgbm.test <- lgb.Dataset.create.valid(lgbm.train,
                                      test.x.lgbm,
                                      label=test.y.lgbm/test.weight.lgbm)

## define parameters
params <- list(
          objective = "poisson",
          metric = "poisson",
          min_data_in_leaf = 500,
          learning_rate = .3,
          feature_fraction = .75,
          bagging_fraction = 0.50,
          seed = nGBMSeed
)

## train model 
if(bUseCache & file.exists(
  paste0(cacheFileRoot,"_lgb_model.txt")
  ) & !bInvalidateCaches)
{
  lgbm1 <- lgb.load(paste0(cacheFileRoot,"_lgb_model.txt"))
} else {
  lgbm1 <- lgb.train(
           params = params,
           data = lgbm.train,
           nrounds = 2000L#,  ## for demo purposes; switch back to 2000
           #valids=list(test=lgbm.test),
           #early_stopping_rounds = 10
           )
  if(bUseCache)
    lgb.save(lgbm1,paste0(cacheFileRoot,"_lgb_model.txt"))
}

## generate predictions
## note: predictions needed to be multiplied by weights, linear models do this automatically
test[,predictions_lgbm1:=predict(lgbm1, test.x.lgbm) * get(resp_offset)]
train[,predictions_lgbm1:=predict(lgbm1, train.x.lgbm) * get(resp_offset)]

Model Illustrations and Graphics

From this, we can plot decile lift and Lorenz curves.

The decile lift plot can be interpreted as a way to visualize the effectiveness of a predictive model. It divides the data into ten parts (deciles) based on the model’s predictions, from the highest probability of an event occurring to the lowest. The steeper the plot against deciles, the better the segmentation or lift. We see three lines. The “table” line indicates that the expected mortality is relatively constant across these model deciles even though the “actual” mortality and the mortality predicted by the “model” vary substantially, indicating significant risk stratification.
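The calculation behind a decile lift table can be sketched in base R. This is an assumption-laden illustration, not the report's `decile.table` function, whose internals are not shown here: the data are simulated, the deciles are formed by record count rather than by exposure, and the names `pred_rate`, `decile`, and `lift` are hypothetical.

```r
set.seed(42)
n <- 1000
expected  <- runif(n, 50, 150)           # hypothetical tabular expected claims
pred_rate <- exp(rnorm(n, 0, 0.3))       # hypothetical predicted A/E ratios
actual    <- rpois(n, lambda = expected * pred_rate)

## Rank-based deciles of the model prediction (1 = lowest predicted ratio)
decile <- ceiling(10 * rank(pred_rate, ties.method = "first") / n)

## Actual and model ratios to the tabular expected, by decile
expected_by_decile <- tapply(expected, decile, sum)
lift <- data.frame(
  decile = 1:10,
  actual = as.vector(tapply(actual, decile, sum) / expected_by_decile),
  model  = as.vector(tapply(pred_rate * expected, decile, sum) / expected_by_decile)
)
```

In a well-stratified model, the `actual` column rises with the `model` column across deciles instead of staying flat.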

The Lorenz curve is another way of visualizing the risk stratification of the model. The more the curve bows away from the line y = x, the greater the Gini coefficient and the greater the risk stratification.
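One common construction of an ordered Lorenz curve and its Gini coefficient is sketched below in base R. The function name `lorenz_gini` is hypothetical, and the report's own `lorenz` function may differ in detail. Records are sorted by predicted ratio, cumulative shares of exposure and actual outcome are accumulated, and the Gini coefficient is taken as one minus twice the area under the curve.

```r
lorenz_gini <- function(actual, pred_rate, exposure) {
  ord <- order(pred_rate)                         # lowest predicted risk first
  x <- c(0, cumsum(exposure[ord]) / sum(exposure))  # cumulative exposure share
  y <- c(0, cumsum(actual[ord]) / sum(actual))      # cumulative outcome share
  ## Trapezoid rule for the area under the curve
  area <- sum(diff(x) * (y[-length(y)] + y[-1]) / 2)
  list(x = x, y = y, gini = 1 - 2 * area)
}

## No stratification: outcomes proportional to exposure, curve lies on y = x
flat <- lorenz_gini(actual = c(2, 4, 6, 8),
                    pred_rate = c(1, 1, 1, 1),
                    exposure = c(1, 2, 3, 4))

## Strong stratification: all outcomes fall in the highest predicted group
steep <- lorenz_gini(actual = c(0, 0, 0, 4),
                     pred_rate = c(0.1, 0.2, 0.3, 0.4),
                     exposure = c(1, 1, 1, 1))
```

Plotting `x` against `y` with a reference line from (0, 0) to (1, 1) gives the visual described above.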

Understanding the behavior of the interactions as well as gain and cover can give us some macro insight into what the model is doing. The feature interaction table ranks the most important interactions in the model. ‘gain’ refers to the improvement in accuracy brought by a feature to the branches it is on, thus indicating the feature is important. ‘cover’ measures the number of observations passing through the nodes where a feature is used to split, regardless of the gain in accuracy achieved; the number of times a feature is used to split is reported separately as ‘frequency’. A high gain with a high cover suggests a feature that is very useful across many parts of the dataset.

Lift Curve

#-----------------------------------------#
##### Validation Metrics #####
#-----------------------------------------#

## generate plot
test[,decile.table(get(resp_var),predictions_lgbm1/get(resp_offset),
                   get(resp_offset))] %>%
  pivot_longer(-c(decile,exposures)) %>% 
  as.data.table() %>%
  ggplot(aes(x=decile, y=value, col=name)) +  
  geom_line() +
  scale_x_continuous(breaks=c(1:10)) +
  labs(x="Decile",y="Response")+
  ggtitle("Decile Lift Plot") +
  theme_minimal() +
  scale_y_continuous(labels=scales::comma)

Lorenz Curve

## lorenz plot
test[,lorenz(get(resp_var), predictions_lgbm1 / get(resp_offset), 
             get(resp_offset))]

Feature Importance

The following plot is the feature importance plot which ranks the mean absolute SHAP value for a given feature. It should be noted that being low on the list does not automatically imply that a feature is unimportant. Due to phenomena such as aggregation bias, features with relatively higher numbers of levels can seemingly rank higher than those with lower numbers of levels. Here, the top three tend to have large numbers of levels versus the bottom four.
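The ranking statistic can be sketched in base R. The contribution matrix below is invented for illustration; in the report the contributions come from the fitted LightGBM model. The sketch computes the mean absolute SHAP value per feature and, assuming contributions are on the log scale as with a Poisson objective, recovers the predicted ratio from each row sum.

```r
## Hypothetical SHAP contributions on the log scale: one row per policy,
## one column per feature, plus the baseline ("BIAS") column
shap <- matrix(
  c( 0.20, -0.05, 0.01, -0.10,
    -0.30,  0.10, 0.02, -0.10,
     0.10, -0.15, 0.00, -0.10),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, c("uw", "face_amount_band", "gender", "BIAS"))
)

## Mean absolute SHAP value per feature, largest first
features   <- setdiff(colnames(shap), "BIAS")
importance <- sort(colMeans(abs(shap[, features])), decreasing = TRUE)

## Contributions add on the log scale, so the predicted ratio is the
## exponential of each row sum (baseline included)
pred_ratio <- exp(rowSums(shap))
```

Multiplying `pred_ratio` by the offset would return the prediction to the claim scale.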

#-----------------------------------------#
##### Feature Importance #####
#-----------------------------------------#

## get most important features
if(bUseCache & file.exists(
  paste0(cacheFileRoot,"_lgb_imp.rds")
  ) & !bInvalidateCaches) {
  imp <- readRDS(paste0(cacheFileRoot,"_lgb_imp.rds"))
} else {
  imp <- lgb.importance(lgbm1, percentage = TRUE)
  if(bUseCache)
    saveRDS(imp,paste0(cacheFileRoot,"_lgb_imp.rds"))
}

## get most important interactions from EIX library
## warning: very slow
if(bUseCache & file.exists(
  paste0(cacheFileRoot,"_lgb_imp_int.rds")
  ) & !bInvalidateCaches) {
  imp.int <- readRDS(paste0(cacheFileRoot,"_lgb_imp_int.rds"))
} else {
  imp.int <- importance(lgbm1, sm, option = "interactions")
  if(bUseCache)
    saveRDS(imp.int,paste0(cacheFileRoot,"_lgb_imp_int.rds"))
}

#-----------------------------------------#
##### Shap Values #####
#-----------------------------------------#

## get shap values for lightgbm
if(bUseCache & file.exists(
  paste0(cacheFileRoot,"_lgb_shap.rds")
  ) & !bInvalidateCaches) {
  shap_lgbm <- readRDS(paste0(cacheFileRoot,"_lgb_shap.rds"))
} else {
  shap_lgbm <- as.data.table(
    predict(lgbm1, 
            test.x.lgbm, 
            rawscore = FALSE, 
            predcontrib = TRUE) 
    ) %>%
    setnames(names(.),
             c(colnames(test.x.lgbm),"BIAS")
    )
  shap_lgbm[,pred:=exp(Reduce('+',.SD))*test.weight.lgbm] ## reproduce model predictions
  if(bUseCache)
    saveRDS(shap_lgbm,paste0(cacheFileRoot,"_lgb_shap.rds"))
}

set.seed(1337)
if(flgbm_vis_subset < 1 ) {
  shp_int_subset <- sample.int(n=nrow(train),
                             size=nrow(train)*flgbm_vis_subset)
} else {
  shp_int_subset <- 1:nrow(train)
}

if(bUseCache & file.exists(
  paste0(cacheFileRoot,"_lgb_shapviz.rds")
  ) & !bInvalidateCaches)
{
  shp <- readRDS(paste0(cacheFileRoot,"_lgb_shapviz.rds"))
} else {
  shp <- shapviz(
    lgbm1,
    X_pred=train.x.lgbm[shp_int_subset,],
    X=train[shp_int_subset]
  )
  
  setDT(shp$X)
  
  if(bUseCache)
    saveRDS(shp,paste0(cacheFileRoot,"_lgb_shapviz.rds"))
}

## Feature importance
sv_importance(shp) + theme_minimal()

Feature Interaction Table

We also develop a table of interaction strengths, sorted by the total contribution to explaining variation in the data. Again, aggregation bias can distort the ranking, so the ranking should be interpreted with caution.

## Convert the 'imp.int' object to a data.table
imp.int <- data.table(imp.int)

## Create a new column 'Feature2' by sorting and collapsing elements of 'Feature'
imp.int[, Feature2 := sapply(Feature, FUN = function(f) {
  paste(sort(unlist(strsplit(f, ":"))), collapse = ":")
})]

## Aggregate 'sumGain', 'sumCover', and 'frequency' by the new 'Feature2' column
imp.int2 <- imp.int[, .(
  sumGain = sum(sumGain),
  sumCover = sum(sumCover),
  frequency = sum(frequency)
), by = .(Feature = Feature2)]

## Calculate additional metrics: meanCover, meanGain, sumGainPct, sumCoverPct
imp.int2[, `:=`(
  meanCover = sumCover / frequency,
  meanGain = sumGain / frequency,
  sumGainPct = sumGain / sum(sumGain),
  sumCoverPct = sumCover / sum(sumCover)
)]

## Split 'Feature' into 'Feature1' and 'Feature2' columns
imp.int2[, c("Feature1", "Feature2") := tstrsplit(Feature, ":")]

## Order by 'sumGain' in descending order and create a flextable with scrollable properties
imp.int2[order(-sumGain)] %>%
  flextable() %>%
  set_table_properties(opts_html = list(
    scroll = list(
      add_css = "max-height: 500px;"
    )
  ))

| Feature | sumGain | sumCover | frequency | meanCover | meanGain | sumGainPct | sumCoverPct | Feature1 | Feature2 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ia_band1:uw | 1,228,300,000 | 22,200,000 | 3,003 | 7,393 | 409,024 | 0.101945 | 0.160579 | ia_band1 | uw |
| face_amount_band:uw | 869,600,000 | 6,159,000 | 2,112 | 2,916 | 411,742 | 0.072174 | 0.044550 | face_amount_band | uw |
| face_amount_band:ia_band1 | 793,300,000 | 6,354,000 | 2,185 | 2,908 | 363,066 | 0.065841 | 0.045960 | face_amount_band | ia_band1 |
| dur_band1:uw | 745,800,000 | 6,395,000 | 1,677 | 3,813 | 444,723 | 0.061899 | 0.046257 | dur_band1 | uw |
| dur_band1:ia_band1 | 658,700,000 | 10,341,000 | 1,739 | 5,947 | 378,781 | 0.054670 | 0.074800 | dur_band1 | ia_band1 |
| dur_band1:face_amount_band | 642,300,000 | 3,509,000 | 1,522 | 2,306 | 422,011 | 0.053309 | 0.025382 | dur_band1 | face_amount_band |
| dur_band1:ltp | 602,600,000 | 4,825,000 | 1,101 | 4,382 | 547,321 | 0.050014 | 0.034901 | dur_band1 | ltp |
| ltp:uw | 539,000,000 | 8,995,000 | 1,432 | 6,281 | 376,397 | 0.044735 | 0.065064 | ltp | uw |
| ia_band1:ltp | 499,900,000 | 6,590,000 | 1,294 | 5,093 | 386,321 | 0.041490 | 0.047667 | ia_band1 | ltp |
| insurance_plan:uw | 450,500,000 | 2,941,000 | 683 | 4,306 | 659,590 | 0.037390 | 0.021273 | insurance_plan | uw |
| face_amount_band:gender | 432,800,000 | 2,219,600 | 1,018 | 2,180 | 425,147 | 0.035921 | 0.016055 | face_amount_band | gender |
| gender:uw | 430,400,000 | 2,691,000 | 1,119 | 2,405 | 384,629 | 0.035722 | 0.019465 | gender | uw |
| gender:ia_band1 | 420,000,000 | 3,063,000 | 1,136 | 2,696 | 369,718 | 0.034859 | 0.022156 | gender | ia_band1 |
| face_amount_band:ltp | 414,800,000 | 4,405,000 | 1,179 | 3,736 | 351,824 | 0.034427 | 0.031863 | face_amount_band | ltp |
| iy_band1:uw | 386,600,000 | 7,947,000 | 1,133 | 7,014 | 341,218 | 0.032087 | 0.057483 | iy_band1 | uw |
| ia_band1:iy_band1 | 384,800,000 | 6,330,000 | 1,041 | 6,081 | 369,645 | 0.031937 | 0.045787 | ia_band1 | iy_band1 |
| face_amount_band:iy_band1 | 374,000,000 | 4,715,000 | 1,023 | 4,609 | 365,591 | 0.031041 | 0.034105 | face_amount_band | iy_band1 |
| ia_band1:insurance_plan | 370,400,000 | 7,233,000 | 964 | 7,503 | 384,232 | 0.030742 | 0.052318 | ia_band1 | insurance_plan |
| face_amount_band:insurance_plan | 321,900,000 | 2,630,400 | 734 | 3,584 | 438,556 | 0.026717 | 0.019026 | face_amount_band | insurance_plan |
| dur_band1:gender | 245,090,000 | 1,512,500 | 688 | 2,198 | 356,235 | 0.020342 | 0.010940 | dur_band1 | gender |
| dur_band1:iy_band1 | 210,720,000 | 3,843,000 | 631 | 6,090 | 333,946 | 0.017489 | 0.027798 | dur_band1 | iy_band1 |
| dur_band1:insurance_plan | 200,320,000 | 1,770,400 | 529 | 3,347 | 378,677 | 0.016626 | 0.012806 | dur_band1 | insurance_plan |
| insurance_plan:ltp | 186,950,000 | 1,353,500 | 164 | 8,253 | 1,139,939 | 0.015516 | 0.009790 | insurance_plan | ltp |
| gender:insurance_plan | 145,750,000 | 837,300 | 310 | 2,701 | 470,161 | 0.012097 | 0.006056 | gender | insurance_plan |
| insurance_plan:iy_band1 | 133,400,000 | 3,095,000 | 427 | 7,248 | 312,412 | 0.011072 | 0.022387 | insurance_plan | iy_band1 |
| iy_band1:ltp | 129,660,000 | 3,644,000 | 556 | 6,554 | 233,201 | 0.010761 | 0.026358 | iy_band1 | ltp |
| gender:iy_band1 | 117,310,000 | 1,447,700 | 421 | 3,439 | 278,646 | 0.009736 | 0.010472 | gender | iy_band1 |
| gender:ltp | 113,770,000 | 1,203,000 | 471 | 2,554 | 241,550 | 0.009443 | 0.008702 | gender | ltp |

Gain vs. Cover

As noted above, ‘gain’ refers to the improvement in accuracy brought by a feature to the branches it is on, thus indicating the feature is important. ‘cover’ measures the number of observations passing through the nodes where a feature is used to split, regardless of the gain in accuracy achieved. A high gain with a high cover suggests a feature that is very useful across many parts of the dataset.

## Create scatter plot
ggplot(imp.int2, aes(x = sumCover, y = sumGain, label = Feature)) + 
  geom_point() + 
  scale_size() +  
  ggrepel::geom_label_repel() +  # Add labels with repulsion
  theme_minimal()  

Feature Plots

It is useful to plot SHAP values for their main effects (e.g., SHAP values for face amount band by face amount band) as well as interactions (e.g., same, but stratified in some way by other variables). Traditionally, scatter plots are used. However, due to overplotting, it is not clear what is going on with the SHAP values. Here we use boxplots of the SHAP values instead of scatter plotting. This provides a sense of the spread of the SHAP values along with the median and outliers. This is particularly useful for qualitatively evaluating whether there are any meaningful interactions.

In what follows, red diamonds are mean SHAP values, while blue squares are mean mortality from a subset of the data. Note that SHAP values are partial effects which work in concert with the other features. Therefore, the mean actual mortality will not necessarily be captured by the variability of the feature SHAP values.

## Load external R script
source("R/ilec_shap_plot.R")

## Select top features to plot
featurestoplot <- imp[1:nPlotTopFeatures, Feature]

## Initialize plot list
plist <- list()

## Loop through top features
for (i in 1:nPlotTopFeatures) {
  ## Filter interactions for current feature
  int.vars <- imp.int2[featurestoplot[i] == Feature1 | featurestoplot[i] == Feature2] %>%
    head(nPlotTopInteractions) %>%
    select(Feature1, Feature2) %>%
    pivot_longer(cols = c(Feature1, Feature2), values_to = "Feature") %>%
    distinct() %>%
    filter(Feature != featurestoplot[i])
  
  ## Add SHAP plot to list
  plist <- c(plist, 
             ilec_shap_plot(
               shp,
               featurestoplot[i],
               int.vars$Feature,
               resp_var = resp_var,
               resp_offset = resp_offset,
               train.data = train[shp_int_subset]
             )
  )
}

## Print plots with headers
plist %>%
  iwalk(~ {
    cat('#### ', .y, '\n\n')
    print(.x)
    cat('\n\n')
  })

main effect: uw

uw x ia_band1

uw x face_amount_band

uw x dur_band1

main effect: face_amount_band

face_amount_band x uw

face_amount_band x ia_band1

face_amount_band x dur_band1

main effect: ia_band1

ia_band1 x uw

ia_band1 x face_amount_band

ia_band1 x dur_band1

Some patterns are noticeable. We discuss them for each group. You can visually detect an interaction by checking whether the box plots are all on the same level or not for a given subgroup.

Underwriting (uw)

  1. Main effect
    1. The spreads from highest to lowest risk classes are similar across non-smoker class systems.
    2. Smoker differentiation is narrower than for 2-class non-smokers.
  2. Interaction with face amount band
    1. The interaction between underwriting and face amount band, for the underwriting effect, appears confined mostly to 3-class non-smoker (N/3/*). Higher face amount bands ($250K+ in the light dataset) appear to have larger spread of effects.
  3. Interaction with issue age band: possible narrowing at older ages for 2- or 4-class non-smokers
  4. Interaction with observation year: possible widening of spread of 4-class non-smokers with increasing observation year

Face Amount Band (face_amount_band)

  1. Main effect: expected decrease as face amount band increases
  2. Interaction with underwriting: face amount effect may be interacting with the Unknown smoker category
  3. Interaction with issue age band:
    1. Decreasing effect by issue age for lower face amount bands, flipping to increasing effect by issue age for upper face amount bands
    2. Put another way, spread of face amount effects decreases with increasing issue age
  4. Interaction with Observation Year: no obvious effect

Issue Age Band (ia_band1)

  1. Main Effect: With the exception of ages 18-24, decreasing issue age effect by issue age
  2. Interaction with face amount band: similar to face amount band, spread decreases with increasing issue age
  3. Interaction with underwriting: substantial changes above issue age 75, qualitatively negligible below age 75
  4. Interaction with observation year: no obvious interaction

Goodness-of-Fit

Goodness-of-fit tables are provided. Each table provides actual-to-model ratios for single variables and for 2-way combinations of variables. A model is qualitatively deemed to perform well if goodness-of-fit ratios are close to 100% in almost all situations. The quantitative assessment using significance testing is omitted here.

Univariate Goodness-of-Fit

## Process each factor column and create formatted tables
map(factor_cols, .f = \(x) {
  ## Convert column name to symbol
  x <- sym(x)
  
  ## Group data by factor column and calculate summary statistics
  train %>%
    group_by(!!x) %>%
    summarize(
      Outcome = sum(amount_actual),
      AM = sum(amount_actual) / sum(predictions_lgbm1)
    ) %>%
    flextable() %>%  # Create a flextable
    set_formatter(  # Format 'AM' as a percentage
      AM = function(x) {
        if (is.numeric(x))
          sprintf("%.1f%%", x * 100)
        else
          x
      }
    ) %>%
    colformat_num(j = "Outcome") %>%  # Format 'Outcome' column
    set_header_labels(  # Set custom header labels
      Outcome = "Outcome",
      AM = "Actual-to-Model"
    ) %>%
    autofit() #%>%
    #knitr::knit_print()  # Print the table in a format suitable for knitting
}) %>%   # Set names for each element in the list
  purrr::set_names(factor_cols) ->
  output_tables

if(output_format == "html") {
  output_tables %>%
    map(.f=knitr::knit_print) %>%
    generate_tabset(  # Generate tabset from the list of tables
      tabtitle = "",
      tablevel = 4
    ) %>%
    cat()  # Print the tabset
} else {
  export_tables_to_excel(
    output_tables,
    paste0(
      exportsRoot,
      "_lightgbm_univariate_goodness_of_fit_tables.xlsx"
    )
  )
  
  cat("See included Excel table for additional information.\n")
  
}
uw

| uw | Outcome | Actual-to-Model |
| --- | --- | --- |
| N/1/1 | 19,203,624,009 | 100.0% |
| N/2/1 | 9,661,768,053 | 100.0% |
| N/2/2 | 11,546,531,146 | 100.0% |
| N/3/1 | 5,950,371,727 | 99.9% |
| N/3/2 | 8,810,968,555 | 100.0% |
| N/3/3 | 14,970,350,057 | 100.0% |
| N/4/1 | 6,842,959,121 | 100.0% |
| N/4/2 | 5,162,511,227 | 100.0% |
| N/4/3 | 3,565,224,892 | 100.0% |
| N/4/4 | 3,742,908,023 | 100.1% |
| S/1/1 | 4,365,025,076 | 100.0% |
| S/2/1 | 1,910,521,866 | 100.1% |
| S/2/2 | 1,704,933,729 | 100.0% |
| U/1/1 | 498,038,979 | 100.1% |

face_amount_band

| face_amount_band | Outcome | Actual-to-Model |
| --- | --- | --- |
| 01 - 0 - 49,999 | 4,193,809,072 | 100.0% |
| 04 - 50,000 - 99,999 | 5,283,024,574 | 100.0% |
| 05 - 100,000 - 249,999 | 16,241,007,604 | 100.0% |
| 06 - 250,000 - 499,999 | 14,745,604,133 | 100.0% |
| 07 - 500,000 - 999,999 | 15,180,744,107 | 100.0% |
| 08 - 1,000,000+ | 42,291,546,970 | 100.0% |

dur_band1

| dur_band1 | Outcome | Actual-to-Model |
| --- | --- | --- |
| 01 | 1,269,136,295 | 100.1% |
| 02 | 1,852,767,081 | 100.0% |
| 03 | 2,494,157,892 | 99.9% |
| 04-05 | 6,321,445,332 | 100.1% |
| 06-15 | 55,677,490,288 | 100.0% |
| 16-25 | 30,320,739,572 | 100.0% |

ia_band1

| ia_band1 | Outcome | Actual-to-Model |
| --- | --- | --- |
| 18-24 | 667,931,081 | 99.9% |
| 25-34 | 6,267,527,815 | 100.0% |
| 35-44 | 16,990,453,137 | 100.0% |
| 45-54 | 20,629,510,315 | 100.0% |
| 55-64 | 19,447,152,090 | 100.0% |
| 65-74 | 16,608,837,516 | 100.0% |
| 75-84 | 15,884,011,950 | 100.0% |
| 85-99 | 1,440,312,556 | 100.0% |

gender

| gender | Outcome | Actual-to-Model |
| --- | --- | --- |
| F | 33,257,191,935 | 100.0% |
| M | 64,678,544,525 | 100.0% |

insurance_plan

| insurance_plan | Outcome | Actual-to-Model |
| --- | --- | --- |
| Other | 187,352,300 | 99.4% |
| Perm | 12,161,923,261 | 100.0% |
| Term | 39,023,616,053 | 100.0% |
| xL | 46,562,844,846 | 100.0% |

ltp

| ltp | Outcome | Actual-to-Model |
| --- | --- | --- |
| 5 yr | 686,343,514 | 100.0% |
| 10 yr | 7,271,664,723 | 100.0% |
| 15 yr | 6,017,756,560 | 100.0% |
| 20 yr | 17,767,048,607 | 100.0% |
| 25 yr | 525,222,922 | 100.3% |
| 30 yr | 3,092,731,785 | 100.1% |
| Not Level Term | 59,614,315,535 | 100.0% |
| Unknown | 2,960,652,814 | 100.0% |

iy_band1

| iy_band1 | Outcome | Actual-to-Model |
| --- | --- | --- |
| 1900-1989 | 1,464,179,514 | 100.0% |
| 1990-1999 | 29,544,147,346 | 100.0% |
| 2000-2009 | 55,045,293,308 | 100.0% |
| 2010+ | 11,882,116,292 | 100.0% |

Bivariate Goodness-of-Fit

## Create a list of unique pairs of factor columns
pairlist <- data.table()
for (i in 1:(length(factor_cols) - 1)) {
  for (j in (i + 1):length(factor_cols)) {
    pairlist <- rbind(pairlist, data.table(F1 = factor_cols[i], F2 = factor_cols[j]))
  }
}

## Generate summary tables and formatted outputs for each pair of factor columns
map2(.x = pairlist$F1, .y = pairlist$F2, .f = \(x, y) {
  ## Convert column names to symbols for tidy evaluation
  xs <- sym(x)
  ys <- sym(y)
  
  ## Put the factor with more levels on the rows and spread the other across
  ## the columns, which keeps the resulting table from becoming too wide
  if (length(train[, levels(get(x))]) >= length(train[, levels(get(y))])) {
    fttmp <- train %>%
      group_by(!!xs, !!ys) %>%
      summarize(
        Outcome = sum(amount_actual),  # Calculate the total outcome
        Ratio = sprintf("%.1f%%", 100 * sum(amount_actual) / sum(predictions_lgbm1))  # Calculate the actual-to-model ratio
      ) %>%
      pivot_wider(
        names_from = !!ys,  # Pivot the data to widen by the second factor
        values_from = c(Outcome, Ratio),  # Use Outcome and Ratio as values
        names_glue = paste0(y, ": {", y, "}.{.value}"),  # Create new column names using glue syntax
        names_vary = "slowest"  # Keep each level's Outcome and Ratio columns next to each other
      )
  } else {
    fttmp <- train %>%
      group_by(!!ys, !!xs) %>%
      summarize(
        Outcome = sum(amount_actual),  # Calculate the total outcome
        Ratio = sprintf("%.1f%%", 100 * sum(amount_actual) / sum(predictions_lgbm1))  # Calculate the actual-to-model ratio
      ) %>%
      pivot_wider(
        names_from = !!xs,  # Pivot the data to widen by the first factor
        values_from = c(Outcome, Ratio),  # Use Outcome and Ratio as values
        names_glue = paste0(x, ": {", x, "}.{.value}"),  # Create new column names using glue syntax
        names_vary = "slowest"  # Keep each level's Outcome and Ratio columns next to each other
      )
  }
  
  ## Adjust column keys for the flextable
  ## Start with the first column name
  fttmp.colkeys <- names(fttmp)[1]
  ## Add pairs of Outcome and Ratio columns, inserting a blank column between each pair
  for (i in 1:((length(names(fttmp)) - 1) / 2)) {
    fttmp.colkeys <- c(fttmp.colkeys, paste0("blank", i), names(fttmp)[(2 * i):(2 * i + 1)])
  }
  
  ## Create and print the flextable
  fttmp %>%
    flextable(col_keys = fttmp.colkeys) %>%
    ftExtra::span_header(sep = "\\.") %>%
    align(align = 'center', part = "all") %>%
    empty_blanks() %>%
    autofit()
}) %>%
  ## Set names for each element in the list based on the factor column pairs
  purrr::set_names(pairlist[, paste0(F1, " x ", F2)]) ->
  output_tables

if(output_format == "html") {
  output_tables %>%
    map(.f=knitr::knit_print) %>%
    ## Generate a tabset from the list of formatted tables
    generate_tabset(tabtitle = "", tablevel = 4) %>%
    ## Print the generated tabset
    cat()  
} else {
  export_tables_to_excel(
    output_tables,
    paste0(
      exportsRoot,
      "_lightgbm_bivariate_goodness_of_fit_tables.xlsx"
    )
  )
  
  cat("See included Excel table for additional information.\n")
}
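
The pair-building loop and the ratio calculation above can be sketched in base R on their own. This is a minimal illustration rather than the tutorial's code: the contents of factor_cols are assumed from the table titles in this section, and the toy_* values are made-up numbers, not experience data.

```r
## Assumed factor list, read off the table titles in this section
factor_cols <- c("uw", "face_amount_band", "dur_band1", "ia_band1",
                 "gender", "insurance_plan", "ltp", "iy_band1")

## combn() returns a 2 x choose(n, 2) matrix of unique pairs;
## transpose it so that each row holds one pair
pair_matrix <- t(combn(factor_cols, 2))
pair_table <- data.frame(F1 = pair_matrix[, 1], F2 = pair_matrix[, 2])

## Toy actual-to-model ratio for a single cell (hypothetical numbers):
## total actual claims divided by total model predictions
toy_actual <- c(100, 250, 150)
toy_model  <- c(110, 240, 150)
toy_ratio  <- sprintf("%.1f%%", 100 * sum(toy_actual) / sum(toy_model))
```

With eight factors, combn() yields 28 pairs in the same order as the nested loop. A ratio near 100% in a cell means the model reproduces the actual claim amounts for that combination of levels.
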
uw x face_amount_band

uw

face_amount_band: 01 - 0 - 49,999

face_amount_band: 04 - 50,000 - 99,999

face_amount_band: 05 - 100,000 - 249,999

face_amount_band: 06 - 250,000 - 499,999

face_amount_band: 07 - 500,000 - 999,999

face_amount_band: 08 - 1,000,000+

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

N/1/1

2,331,609,989

100.1%

2,665,573,911

99.9%

4,377,227,666

100.0%

2,484,222,240

99.9%

2,280,087,941

100.0%

5,064,902,262

100.0%

N/2/1

67,510,912

100.0%

227,491,364

100.0%

1,877,102,924

100.0%

1,581,998,866

100.0%

1,613,477,143

100.2%

4,294,186,844

100.0%

N/2/2

478,620,100

100.0%

716,716,216

100.1%

2,460,749,546

100.1%

1,807,687,840

99.7%

1,536,284,872

100.5%

4,546,472,572

99.9%

N/3/1

4,589,248

106.8%

36,548,841

98.6%

636,379,622

100.1%

1,099,392,788

99.9%

1,269,934,939

100.3%

2,903,526,289

99.8%

N/3/2

15,879,147

95.9%

84,477,290

103.2%

824,647,078

99.4%

1,194,151,445

100.7%

1,294,166,133

99.2%

5,397,647,462

100.1%

N/3/3

89,078,013

100.0%

266,651,094

99.1%

1,685,052,988

100.4%

1,897,715,243

99.7%

1,940,524,074

100.0%

9,091,328,645

100.0%

N/4/1

2,890,726

85.8%

23,168,994

99.2%

533,862,585

99.9%

1,152,655,538

100.3%

1,611,064,623

99.9%

3,519,316,655

99.9%

N/4/2

5,630,152

96.8%

43,592,461

100.0%

580,655,929

99.4%

959,152,506

100.7%

1,142,130,198

99.7%

2,431,349,981

100.0%

N/4/3

2,758,598

104.1%

19,493,024

94.9%

410,982,168

100.6%

652,807,204

99.4%

822,151,439

100.4%

1,657,032,459

100.0%

N/4/4

16,764,842

98.8%

69,558,070

102.0%

501,578,433

100.1%

668,382,546

99.8%

698,077,315

99.8%

1,788,546,817

100.2%

S/1/1

846,343,572

100.0%

864,666,577

100.1%

1,161,486,983

99.6%

510,128,489

100.4%

361,498,469

100.2%

620,900,986

100.1%

S/2/1

21,486,405

97.5%

85,321,233

100.7%

614,066,701

100.0%

400,530,746

100.3%

315,167,594

100.0%

473,949,187

100.1%

S/2/2

74,338,988

100.4%

135,934,467

100.0%

492,786,516

100.4%

294,838,420

100.1%

267,040,410

100.1%

439,994,928

99.2%

U/1/1

236,308,380

99.8%

43,831,032

98.9%

84,428,465

100.6%

41,940,262

99.8%

29,138,957

100.6%

62,391,883

101.1%

uw x dur_band1

uw

dur_band1: 01

dur_band1: 02

dur_band1: 03

dur_band1: 04-05

dur_band1: 06-15

dur_band1: 16-25

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

N/1/1

107,404,206

99.1%

127,611,866

99.4%

186,488,501

100.0%

495,415,170

100.6%

4,962,783,774

99.9%

13,323,920,492

100.0%

N/2/1

38,939,578

98.9%

117,924,754

100.1%

103,706,358

100.7%

187,685,672

99.3%

4,423,311,154

100.2%

4,790,200,537

99.9%

N/2/2

68,820,764

102.6%

89,429,859

99.0%

153,874,204

98.5%

321,513,019

100.8%

6,248,105,591

100.0%

4,664,787,709

100.0%

N/3/1

91,917,216

100.0%

162,393,895

99.7%

239,046,418

100.4%

463,263,819

99.3%

4,110,289,163

100.1%

883,461,216

99.6%

N/3/2

73,068,462

98.2%

142,447,178

101.1%

172,423,532

100.0%

471,435,974

99.7%

7,112,246,310

99.9%

839,347,099

100.7%

N/3/3

209,144,931

100.7%

274,764,174

99.7%

374,385,325

100.5%

1,050,466,173

100.3%

12,379,149,521

99.9%

682,439,933

100.2%

N/4/1

174,316,019

100.9%

228,792,240

99.9%

389,498,840

100.4%

1,034,633,390

99.5%

4,624,074,611

100.1%

391,644,021

99.4%

N/4/2

115,650,082

100.3%

173,110,541

100.1%

196,392,926

99.1%

696,533,900

100.5%

3,717,590,891

100.0%

263,232,887

99.1%

N/4/3

97,264,439

99.8%

192,504,643

101.2%

206,119,588

98.6%

501,287,156

100.6%

2,450,944,955

99.9%

117,104,111

100.4%

N/4/4

153,464,379

99.7%

195,703,696

98.8%

283,447,385

100.5%

660,958,667

100.1%

2,334,500,075

100.3%

114,833,821

96.6%

S/1/1

30,145,775

98.2%

28,847,772

99.1%

40,393,654

98.6%

91,257,938

99.4%

1,121,084,058

99.8%

3,053,295,879

100.1%

S/2/1

42,753,182

99.7%

58,621,979

102.5%

74,856,996

99.8%

177,694,957

99.6%

1,089,203,131

100.0%

467,391,621

100.3%

S/2/2

55,705,006

100.9%

55,852,780

99.5%

68,480,349

99.9%

160,693,564

100.5%

985,575,007

100.0%

378,627,023

99.7%

U/1/1

10,542,256

96.8%

4,761,704

99.2%

5,043,816

98.7%

8,605,933

94.6%

118,632,047

101.2%

350,453,223

100.0%

uw x ia_band1

uw

ia_band1: 18-24

ia_band1: 25-34

ia_band1: 35-44

ia_band1: 45-54

ia_band1: 55-64

ia_band1: 65-74

ia_band1: 75-84

ia_band1: 85-99

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

N/1/1

195,107,686

99.3%

1,468,766,511

99.9%

3,219,766,527

100.1%

3,573,783,086

100.0%

4,457,649,299

100.0%

4,148,815,415

100.0%

1,980,748,076

100.3%

158,987,409

94.0%

N/2/1

87,770,402

102.3%

804,579,965

99.9%

1,676,447,802

99.8%

1,660,839,957

100.1%

1,585,401,151

99.8%

2,098,151,266

100.3%

1,649,244,265

99.4%

99,333,245

112.7%

N/2/2

82,359,855

100.2%

734,969,884

99.5%

1,537,827,224

100.1%

1,935,817,830

99.9%

1,907,603,917

100.5%

2,320,063,956

99.9%

2,742,077,451

100.0%

285,811,029

99.4%

N/3/1

42,971,227

98.3%

542,310,521

100.6%

1,509,846,536

100.2%

1,517,047,115

99.3%

1,083,241,294

100.8%

682,592,797

98.7%

557,962,237

100.5%

14,400,000

101.6%

N/3/2

26,380,775

99.8%

339,627,773

99.6%

1,177,249,350

100.1%

1,552,812,212

99.8%

1,501,263,082

99.3%

1,589,866,999

100.3%

2,421,887,776

100.3%

201,880,588

99.8%

N/3/3

29,150,024

100.1%

391,634,706

99.8%

1,364,327,286

99.6%

1,932,181,705

100.6%

2,306,037,454

99.7%

2,995,484,451

100.0%

5,412,062,505

100.0%

539,471,926

100.1%

N/4/1

26,287,124

97.1%

602,385,596

101.0%

2,097,046,161

99.9%

2,253,038,301

99.8%

1,386,295,039

100.4%

431,665,885

101.7%

43,983,146

78.3%

2,257,869

199.0%

N/4/2

9,905,787

106.8%

233,732,519

99.9%

1,112,629,077

100.1%

1,595,224,194

99.9%

1,413,399,308

100.1%

448,165,240

98.8%

313,744,476

101.6%

35,710,626

97.9%

N/4/3

7,030,001

98.6%

209,711,793

100.1%

705,969,366

99.7%

1,147,207,102

100.7%

986,258,205

99.3%

356,988,649

100.5%

140,290,698

99.1%

11,769,078

103.5%

N/4/4

9,949,832

98.4%

167,099,067

99.2%

638,256,939

100.7%

1,003,674,951

99.7%

958,007,317

100.3%

551,614,283

99.7%

345,232,117

99.8%

69,073,517

101.1%

S/1/1

68,766,581

100.3%

451,737,170

100.2%

1,077,472,651

99.9%

1,226,674,054

99.9%

936,351,195

100.1%

461,642,247

99.8%

139,279,941

102.4%

3,101,237

49.6%

S/2/1

26,876,275

100.6%

179,096,556

99.3%

474,899,912

99.9%

608,218,812

100.3%

402,893,657

99.5%

174,002,125

101.3%

42,414,641

99.5%

2,119,888

284.2%

S/2/2

17,471,313

100.2%

117,414,041

99.5%

333,323,589

100.0%

526,461,474

100.3%

400,538,301

100.1%

233,737,976

99.4%

63,750,567

97.4%

12,236,468

111.0%

U/1/1

37,904,199

98.8%

24,461,713

99.4%

65,390,717

101.3%

96,529,522

99.9%

122,212,871

98.6%

116,046,227

101.2%

31,334,054

102.6%

4,159,676

92.1%

uw x gender

uw      gender: F                gender: M
               Outcome   Ratio          Outcome   Ratio
N/1/1    6,871,313,340  100.0%   12,332,310,669  100.0%
N/2/1    3,739,559,716  100.1%    5,922,208,337  100.0%
N/2/2    4,668,956,646  100.0%    6,877,574,500  100.0%
N/3/1    1,982,219,430  100.0%    3,968,152,297   99.9%
N/3/2    3,036,089,596  100.1%    5,774,878,959   99.9%
N/3/3    5,764,067,198   99.8%    9,206,282,859  100.1%
N/4/1    1,822,625,798  100.0%    5,020,333,323  100.0%
N/4/2      997,342,369  100.0%    4,165,168,858  100.0%
N/4/3      561,153,990  100.3%    3,004,070,902   99.9%
N/4/4      894,420,215   99.9%    2,848,487,808  100.1%
S/1/1    1,618,467,456  100.2%    2,746,557,620   99.9%
S/2/1      534,567,286   99.9%    1,375,954,580  100.1%
S/2/2      553,360,607  100.0%    1,151,573,122   99.9%
U/1/1      213,048,288   99.9%      284,990,691  100.2%

uw x insurance_plan

uw

insurance_plan: Other

insurance_plan: Perm

insurance_plan: Term

insurance_plan: xL

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

N/1/1

41,338,662

100.4%

6,000,364,392

100.0%

3,545,626,085

100.0%

9,616,294,870

100.0%

N/2/1

10,081,665

103.9%

1,335,375,364

100.1%

2,232,053,273

99.9%

6,084,257,751

100.1%

N/2/2

15,784,451

94.9%

1,659,464,730

100.0%

3,001,480,499

99.8%

6,869,801,466

100.1%

N/3/1

40,123,942

97.1%

317,915,492

99.5%

3,500,287,255

100.1%

2,092,045,038

99.9%

N/3/2

20,508,375

101.7%

282,009,643

100.0%

3,266,885,153

100.0%

5,241,565,384

100.0%

N/3/3

32,085,711

102.2%

496,907,377

100.1%

4,052,652,325

100.0%

10,388,704,644

100.0%

N/4/1

2,300,000

104.3%

9,330,053

119.4%

6,251,798,169

100.0%

579,530,899

99.3%

N/4/2

7,223,579

96.0%

7,940,683

103.8%

4,382,666,145

100.0%

764,680,820

100.2%

N/4/3

103,684

11.3%

2,495,706

83.6%

3,036,110,841

100.0%

526,514,661

100.0%

N/4/4

3,739,130

125.0%

12,550,388

103.6%

2,742,467,039

100.0%

984,151,466

100.1%

S/1/1

3,801,542

108.0%

1,397,404,613

100.1%

1,062,735,519

99.6%

1,901,083,402

100.1%

S/2/1

6,544,922

101.5%

163,842,775

99.2%

1,131,185,266

100.2%

608,948,903

100.2%

S/2/2

3,059,454

81.3%

189,503,330

100.0%

778,511,625

100.0%

733,859,320

100.0%

U/1/1

657,183

91.0%

286,818,715

100.1%

39,156,859

97.3%

171,406,222

100.6%

uw x ltp

uw

ltp: 5 yr

ltp: 10 yr

ltp: 15 yr

ltp: 20 yr

ltp: 25 yr

ltp: 30 yr

ltp: Not Level Term

ltp: Unknown

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

N/1/1

167,693,145

100.8%

329,259,367

99.4%

313,430,210

100.3%

1,254,368,612

99.9%

22,798,990

101.7%

140,376,344

100.1%

15,709,773,948

100.0%

1,265,923,393

100.0%

N/2/1

109,944,549

99.6%

251,674,560

99.9%

154,929,926

101.5%

1,003,102,210

99.7%

54,969,322

100.7%

91,526,385

101.6%

7,455,632,156

100.1%

539,988,945

99.7%

N/2/2

101,292,185

99.8%

464,466,296

100.6%

319,758,321

98.0%

1,222,999,988

100.1%

275,278,999

101.9%

237,099,784

98.2%

8,573,983,079

100.0%

351,652,494

100.2%

N/3/1

41,020,000

100.9%

402,661,935

99.3%

437,170,279

101.0%

1,876,102,407

99.6%

32,892,500

96.7%

371,739,248

101.5%

2,700,444,573

100.0%

88,340,785

98.1%

N/3/2

62,023,000

99.2%

523,988,195

100.0%

476,579,573

99.5%

1,824,845,270

100.4%

12,110,000

84.6%

210,771,725

98.0%

5,612,039,277

100.0%

88,611,515

102.6%

N/3/3

59,980,899

97.5%

896,107,443

100.4%

652,611,790

100.4%

1,842,591,351

100.3%

16,475,000

93.1%

201,629,249

99.2%

11,161,374,076

100.0%

139,580,249

98.7%

N/4/1

4,410,000

130.9%

1,142,209,199

99.8%

1,124,192,978

99.7%

3,080,000,938

99.9%

36,372,000

101.2%

845,189,202

100.8%

591,162,804

99.6%

19,422,000

108.1%

N/4/2

985,000

72.4%

899,393,774

100.3%

918,062,758

100.1%

2,118,167,322

99.9%

14,432,100

92.3%

414,673,900

99.7%

779,845,082

100.2%

16,951,291

96.6%

N/4/3

300,000

74.6%

711,850,279

99.6%

663,787,225

100.1%

1,348,193,434

99.9%

8,775,000

136.9%

270,343,655

100.8%

529,114,051

99.8%

32,861,248

102.7%

N/4/4

1,900,000

140.0%

861,440,026

99.9%

549,878,289

99.9%

1,117,119,304

100.2%

8,832,928

90.4%

186,021,492

99.5%

1,000,440,984

100.2%

17,275,000

100.8%

S/1/1

77,649,707

99.6%

220,477,191

101.2%

92,140,018

97.7%

325,974,726

100.1%

37,348,584

99.0%

43,126,718

94.3%

3,324,217,468

100.1%

244,090,664

99.9%

S/2/1

35,606,411

101.5%

312,687,204

99.6%

190,844,063

100.8%

450,793,441

99.4%

2,925,000

122.6%

50,493,377

103.5%

784,528,034

100.0%

82,644,336

101.0%

S/2/2

19,448,975

97.3%

251,942,737

99.9%

122,991,130

100.9%

294,652,404

100.4%

1,050,000

64.9%

28,434,706

101.5%

932,515,030

99.7%

53,898,747

100.6%

U/1/1

4,089,643

104.6%

3,506,517

89.2%

1,380,000

54.0%

8,137,200

120.8%

962,499

80.5%

1,306,000

78.4%

459,244,973

100.3%

19,412,147

97.5%

uw x iy_band1

uw

iy_band1: 1900-1989

iy_band1: 1990-1999

iy_band1: 2000-2009

iy_band1: 2010+

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

N/1/1

982,160,538

99.7%

12,534,732,161

100.0%

4,842,664,276

100.0%

844,067,034

99.9%

N/2/1

112,802,132

103.1%

4,655,974,027

99.9%

4,443,736,468

100.1%

449,255,426

100.3%

N/2/2

107,232,938

102.8%

4,512,424,274

100.0%

6,312,557,239

100.0%

614,316,695

99.3%

N/3/1

282,000

31.1%

1,044,203,844

99.8%

3,887,317,728

99.8%

1,018,568,155

100.6%

N/3/2

250,000

17.7%

1,008,204,301

100.5%

6,934,988,157

100.0%

867,526,097

99.5%

N/3/3

4,055,561

57.8%

755,727,777

100.2%

12,308,187,396

100.0%

1,902,379,323

99.9%

N/4/1

0

0.0%

394,664,512

100.2%

4,606,160,597

99.9%

1,842,134,012

100.0%

N/4/2

308,834,925

99.5%

3,702,484,884

100.0%

1,151,191,418

100.1%

N/4/3

158,854,990

100.9%

2,386,096,189

99.8%

1,020,273,713

100.3%

N/4/4

0

0.0%

146,243,472

99.0%

2,314,552,528

100.3%

1,282,112,023

99.8%

S/1/1

212,607,764

100.1%

2,836,569,692

100.1%

1,124,908,213

100.0%

190,939,407

98.9%

S/2/1

7,889,930

88.0%

479,056,419

100.4%

1,077,182,521

99.8%

346,392,996

100.7%

S/2/2

7,665,540

106.5%

388,469,845

99.3%

986,492,416

100.1%

322,305,928

100.1%

U/1/1

29,233,111

104.7%

320,187,107

99.7%

117,964,696

100.7%

30,654,065

97.4%

face_amount_band x dur_band1

face_amount_band

dur_band1: 01

dur_band1: 02

dur_band1: 03

dur_band1: 04-05

dur_band1: 06-15

dur_band1: 16-25

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

01 - 0 - 49,999

37,859,834

99.5%

46,746,236

99.3%

59,349,580

99.5%

129,333,013

100.4%

856,334,359

100.1%

3,064,186,050

100.0%

04 - 50,000 - 99,999

52,511,061

99.2%

66,787,618

100.7%

90,269,149

100.2%

206,424,919

99.9%

1,469,057,939

99.9%

3,397,973,888

100.0%

05 - 100,000 - 249,999

191,101,070

100.1%

269,652,985

100.1%

371,980,209

99.7%

855,544,316

100.2%

7,074,965,104

100.0%

7,477,763,920

100.0%

06 - 250,000 - 499,999

209,469,031

100.4%

304,453,835

100.2%

416,690,820

100.1%

1,000,438,105

99.9%

8,109,440,540

100.0%

4,705,111,802

100.1%

07 - 500,000 - 999,999

215,479,316

100.0%

350,555,118

100.2%

461,397,659

100.0%

1,167,345,786

100.2%

8,791,983,362

100.0%

4,193,982,866

100.0%

08 - 1,000,000+

562,715,983

100.1%

814,571,289

99.7%

1,094,470,475

100.0%

2,962,359,193

100.0%

29,375,708,984

100.0%

7,481,721,046

99.9%

face_amount_band x ia_band1

ia_band1

face_amount_band: 01 - 0 - 49,999

face_amount_band: 04 - 50,000 - 99,999

face_amount_band: 05 - 100,000 - 249,999

face_amount_band: 06 - 250,000 - 499,999

face_amount_band: 07 - 500,000 - 999,999

face_amount_band: 08 - 1,000,000+

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

18-24

75,552,784

99.8%

133,231,819

100.0%

220,299,419

99.6%

115,618,404

100.1%

61,754,428

100.3%

61,474,227

100.3%

25-34

192,799,300

100.6%

458,412,108

99.7%

1,476,938,558

100.2%

1,398,298,659

99.7%

1,378,321,244

100.0%

1,362,757,946

100.0%

35-44

421,318,794

99.6%

898,534,810

100.2%

3,004,123,357

100.0%

3,258,823,688

100.1%

3,627,163,035

99.9%

5,780,489,453

100.0%

45-54

861,129,196

100.0%

1,234,583,452

100.0%

4,027,597,384

100.0%

3,776,175,782

100.0%

3,750,396,726

99.9%

6,979,627,775

100.0%

55-64

1,371,789,823

100.0%

1,302,388,605

100.0%

4,041,501,912

99.9%

3,254,795,121

100.1%

2,876,205,825

100.1%

6,600,470,804

99.9%

65-74

1,035,580,594

100.0%

950,356,588

99.9%

2,489,528,859

100.1%

1,929,056,738

100.0%

2,107,668,820

100.2%

8,096,645,917

100.0%

75-84

221,626,122

100.5%

285,327,929

99.8%

901,837,477

100.1%

915,794,848

99.9%

1,257,117,333

100.0%

12,302,308,241

100.0%

85-99

14,012,459

94.2%

20,189,263

101.4%

79,180,638

99.3%

97,040,893

100.9%

122,116,696

100.2%

1,107,772,607

100.0%

face_amount_band x gender

face_amount_band         gender: F                gender: M
                                Outcome   Ratio          Outcome   Ratio
01 - 0 - 49,999           2,082,160,405  100.0%    2,111,648,667  100.0%
04 - 50,000 - 99,999      2,075,208,647  100.1%    3,207,815,927   99.9%
05 - 100,000 - 249,999    5,592,847,725  100.0%   10,648,159,879  100.0%
06 - 250,000 - 499,999    4,715,664,348   99.9%   10,029,939,785  100.1%
07 - 500,000 - 999,999    4,435,784,376  100.2%   10,744,959,731   99.9%
08 - 1,000,000+          14,355,526,434  100.0%   27,936,020,536  100.0%

face_amount_band x insurance_plan

face_amount_band

insurance_plan: Other

insurance_plan: Perm

insurance_plan: Term

insurance_plan: xL

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

01 - 0 - 49,999

4,987,841

90.2%

2,341,324,220

99.9%

430,107,626

100.2%

1,417,389,385

100.1%

04 - 50,000 - 99,999

5,666,783

100.5%

1,269,348,289

100.1%

1,013,673,047

99.8%

2,994,336,455

100.0%

05 - 100,000 - 249,999

34,041,514

99.7%

2,315,213,211

100.0%

6,874,173,321

100.0%

7,017,579,558

100.0%

06 - 250,000 - 499,999

28,639,038

99.5%

1,510,214,779

100.0%

8,348,184,153

100.1%

4,858,566,163

99.9%

07 - 500,000 - 999,999

18,132,026

98.8%

1,434,005,349

100.2%

8,575,363,506

99.8%

5,153,243,226

100.2%

08 - 1,000,000+

95,885,098

99.9%

3,291,817,413

100.0%

13,782,114,400

100.0%

25,121,730,059

100.0%

face_amount_band x ltp

ltp

face_amount_band: 01 - 0 - 49,999

face_amount_band: 04 - 50,000 - 99,999

face_amount_band: 05 - 100,000 - 249,999

face_amount_band: 06 - 250,000 - 499,999

face_amount_band: 07 - 500,000 - 999,999

face_amount_band: 08 - 1,000,000+

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

5 yr

40,106,609

101.5%

108,758,076

99.7%

223,852,354

99.8%

134,040,438

100.0%

88,356,037

100.1%

91,230,000

100.3%

10 yr

79,471,212

100.1%

152,307,148

100.5%

1,212,386,424

100.0%

1,312,587,882

99.9%

1,417,042,517

99.9%

3,097,869,540

100.0%

15 yr

63,154,393

99.8%

171,454,282

99.3%

1,115,915,003

100.1%

1,284,208,527

100.2%

1,254,669,304

99.8%

2,128,355,051

99.9%

20 yr

113,505,987

100.6%

278,179,336

99.8%

2,893,988,498

99.9%

3,894,785,171

100.1%

4,168,688,693

99.9%

6,417,900,922

100.0%

25 yr

9,580,391

103.9%

26,792,196

100.0%

154,206,000

100.5%

162,940,708

100.3%

101,887,627

100.5%

69,816,000

98.7%

30 yr

6,329,793

102.1%

31,267,129

96.9%

436,286,383

100.6%

721,667,111

99.6%

809,282,434

100.1%

1,087,898,935

100.3%

Not Level Term

3,802,855,700

100.0%

4,317,239,186

100.0%

9,477,900,932

100.0%

6,615,106,546

100.0%

6,758,480,601

100.1%

28,642,732,570

100.0%

Unknown

78,804,987

99.8%

197,027,221

99.9%

726,472,010

100.4%

620,267,750

99.9%

582,336,894

99.8%

755,743,952

99.9%

face_amount_band x iy_band1

face_amount_band

iy_band1: 1900-1989

iy_band1: 1990-1999

iy_band1: 2000-2009

iy_band1: 2010+

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

01 - 0 - 49,999

267,683,109

99.9%

2,808,814,596

100.0%

833,122,239

100.1%

284,189,128

99.9%

04 - 50,000 - 99,999

267,216,839

100.1%

3,149,056,719

99.9%

1,444,089,389

100.1%

422,661,627

100.0%

05 - 100,000 - 249,999

404,345,538

99.7%

7,205,856,453

100.0%

6,896,526,088

100.0%

1,734,279,525

100.0%

06 - 250,000 - 499,999

158,023,865

100.0%

4,708,333,873

100.1%

7,919,827,744

100.0%

1,959,418,651

99.9%

07 - 500,000 - 999,999

136,967,257

99.5%

4,219,050,921

100.1%

8,611,625,509

99.9%

2,213,100,420

100.3%

08 - 1,000,000+

229,942,906

100.5%

7,453,034,784

99.9%

29,340,102,339

100.0%

5,268,466,941

99.9%

dur_band1 x ia_band1

ia_band1

dur_band1: 01

dur_band1: 02

dur_band1: 03

dur_band1: 04-05

dur_band1: 06-15

dur_band1: 16-25

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

18-24

38,014,843

100.7%

30,257,763

100.6%

27,602,930

98.2%

48,778,652

99.3%

247,658,415

100.1%

275,618,478

99.9%

25-34

134,511,109

99.4%

163,517,330

99.4%

194,704,716

100.9%

424,147,771

100.0%

2,807,372,903

100.0%

2,543,273,986

99.9%

35-44

290,877,481

100.1%

372,345,417

100.5%

502,388,214

99.8%

1,120,149,604

100.2%

8,625,261,989

100.0%

6,079,430,432

100.0%

45-54

284,754,759

100.3%

443,414,694

99.4%

668,187,442

99.7%

1,648,875,363

100.2%

10,888,781,617

100.1%

6,695,496,440

99.9%

55-64

316,638,372

100.2%

537,383,357

100.0%

610,988,925

100.0%

1,580,728,594

100.0%

9,237,741,705

99.8%

7,163,671,137

100.2%

65-74

143,303,417

99.5%

240,017,175

100.9%

343,292,965

100.4%

930,432,092

99.4%

8,689,000,890

100.1%

6,262,790,977

100.0%

75-84

45,633,095

99.5%

49,322,782

97.8%

121,205,864

100.0%

506,866,160

100.7%

13,883,785,714

99.9%

1,277,198,335

100.5%

85-99

15,403,219

106.1%

16,508,563

98.1%

25,786,836

97.1%

61,467,096

98.8%

1,297,887,055

100.9%

23,259,787

69.4%

dur_band1 x gender

dur_band1   gender: F                gender: M
                   Outcome   Ratio          Outcome   Ratio
01             322,942,506   99.9%      946,193,789  100.1%
02             424,637,814  100.3%    1,428,129,267   99.8%
03             697,046,249  100.3%    1,797,111,643   99.8%
04-05        1,788,538,549  100.0%    4,532,906,783  100.1%
06-15       19,652,776,350  100.0%   36,024,713,938  100.0%
16-25       10,371,250,467  100.0%   19,949,489,105  100.0%

dur_band1 x insurance_plan

dur_band1

insurance_plan: Other

insurance_plan: Perm

insurance_plan: Term

insurance_plan: xL

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

01

6,987,296

100.6%

115,387,649

99.4%

816,300,396

100.2%

330,460,954

100.1%

02

12,956,089

100.9%

186,759,670

99.8%

1,193,573,840

100.0%

459,477,482

99.9%

03

10,940,700

97.9%

173,260,241

100.2%

1,539,568,218

100.0%

770,388,733

99.8%

04-05

41,382,619

100.8%

351,202,533

99.6%

3,960,972,351

100.0%

1,967,887,829

100.2%

06-15

92,001,723

98.7%

2,251,458,889

100.1%

24,546,892,971

100.0%

28,787,136,705

100.0%

16-25

23,083,873

99.3%

9,083,854,279

100.0%

6,966,308,277

99.9%

14,247,493,143

100.0%

dur_band1 x ltp

ltp

dur_band1: 01

dur_band1: 02

dur_band1: 03

dur_band1: 04-05

dur_band1: 06-15

dur_band1: 16-25

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

5 yr

8,474,532

101.0%

5,768,265

89.9%

13,120,000

99.3%

47,280,625

100.7%

258,887,272

100.3%

352,812,820

99.9%

10 yr

250,039,805

100.0%

374,445,553

100.4%

463,240,176

99.7%

1,216,385,768

100.0%

4,556,704,077

100.0%

410,849,344

99.6%

15 yr

106,613,913

99.4%

154,795,986

100.0%

219,264,610

100.9%

520,402,972

99.3%

4,466,179,563

100.0%

550,499,516

100.2%

20 yr

314,198,709

100.6%

447,558,227

100.0%

600,176,032

99.4%

1,543,338,839

100.2%

11,599,090,681

100.0%

3,262,686,119

99.9%

25 yr

3,615,999

76.5%

13,707,800

112.5%

18,921,628

102.6%

49,784,528

99.8%

267,598,048

99.8%

171,594,919

100.6%

30 yr

66,053,409

101.4%

98,748,126

97.3%

130,459,928

101.9%

388,766,012

100.3%

1,983,373,230

100.1%

425,331,080

100.1%

Not Level Term

453,235,980

99.9%

659,812,994

99.9%

954,906,252

99.9%

2,361,669,130

100.1%

31,616,749,475

100.0%

23,567,941,704

100.0%

Unknown

66,903,948

100.2%

97,930,130

100.1%

94,069,266

100.3%

193,817,458

100.3%

928,907,942

100.2%

1,579,024,070

99.8%

dur_band1 x iy_band1

dur_band1

iy_band1: 2010+

iy_band1: 2000-2009

iy_band1: 1990-1999

iy_band1: 1900-1989

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

01

1,269,136,295

100.1%

02

1,852,767,081

100.0%

03

2,494,157,892

99.9%

04-05

4,479,996,408

100.1%

1,841,448,924

99.9%

06-15

1,786,058,616

99.7%

51,370,835,971

100.0%

2,520,595,701

99.9%

16-25

1,833,008,413

99.7%

27,023,551,645

100.0%

1,464,179,514

100.0%

ia_band1 x gender

ia_band1   gender: F                gender: M
                  Outcome   Ratio          Outcome   Ratio
18-24         210,351,040  100.2%      457,580,041   99.8%
25-34       1,919,168,906   99.9%    4,348,358,909  100.0%
35-44       4,376,241,525  100.0%   12,614,211,612  100.0%
45-54       4,611,772,525  100.0%   16,017,737,790  100.0%
55-64       4,809,017,158  100.1%   14,638,134,932   99.9%
65-74       7,381,394,530   99.9%    9,227,442,986  100.1%
75-84       9,045,279,898  100.0%    6,838,732,052  100.0%
85-99         903,966,353  100.2%      536,346,203   99.8%

ia_band1 x insurance_plan

ia_band1

insurance_plan: Other

insurance_plan: Perm

insurance_plan: Term

insurance_plan: xL

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

18-24

708,665

69.0%

166,230,095

99.8%

267,466,700

100.2%

233,525,621

99.8%

25-34

3,256,794

74.4%

852,327,846

100.2%

4,057,228,956

99.9%

1,354,714,219

100.1%

35-44

21,307,402

99.6%

1,846,365,837

100.1%

11,715,945,894

100.0%

3,406,834,004

99.8%

45-54

57,525,597

103.6%

2,565,363,967

99.8%

12,641,031,673

100.0%

5,365,589,078

100.2%

55-64

47,144,955

97.2%

3,306,501,193

100.2%

8,357,779,097

100.0%

7,735,726,845

99.9%

65-74

35,749,661

102.3%

2,847,699,847

99.9%

1,896,982,781

99.9%

11,828,405,227

100.1%

75-84

21,392,494

100.5%

492,087,214

99.4%

86,369,648

101.2%

15,284,162,594

100.0%

85-99

266,732

19.4%

85,347,262

105.1%

811,304

75.2%

1,353,887,258

99.8%

ia_band1 x ltp

ia_band1

ltp: 5 yr

ltp: 10 yr

ltp: 15 yr

ltp: 20 yr

ltp: 25 yr

ltp: 30 yr

ltp: Not Level Term

ltp: Unknown

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

18-24

17,726,831

97.1%

32,009,383

100.5%

3,343,000

83.0%

80,676,936

101.0%

13,002,500

96.2%

48,824,425

99.6%

403,156,010

99.9%

69,191,996

101.5%

25-34

147,704,951

100.5%

317,080,031

99.8%

150,181,284

99.2%

1,670,129,784

100.1%

164,417,165

99.9%

748,387,532

99.8%

2,274,993,951

100.0%

794,633,117

99.9%

35-44

232,956,798

99.9%

1,049,894,842

99.9%

983,755,234

99.9%

6,112,699,729

100.1%

222,687,940

99.1%

1,620,869,320

100.1%

5,508,205,763

100.0%

1,259,383,511

100.0%

45-54

178,555,603

100.2%

2,312,649,990

99.9%

2,156,659,786

100.1%

6,421,617,312

100.0%

95,336,910

101.6%

652,477,911

100.3%

8,212,889,839

100.0%

599,322,964

99.9%

55-64

76,414,209

99.1%

2,489,338,251

100.0%

2,056,964,977

100.0%

3,341,836,281

99.9%

27,176,407

109.8%

22,072,597

104.3%

11,252,197,601

100.0%

181,151,767

101.6%

65-74

27,015,122

90.2%

1,014,083,731

100.2%

662,777,279

100.2%

140,088,565

98.4%

2,602,000

106.3%

100,000

43.9%

14,725,682,019

100.0%

36,488,800

93.5%

75-84

5,970,000

241.6%

56,608,495

97.6%

4,065,000

71.7%

0

0.0%

15,797,689,100

100.0%

19,679,355

104.3%

85-99

0

0.0%

10,000

9460.3%

1,439,501,252

100.0%

801,304

75.5%

ia_band1 x iy_band1

ia_band1

iy_band1: 1900-1989

iy_band1: 1990-1999

iy_band1: 2000-2009

iy_band1: 2010+

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

18-24

21,957,490

100.8%

248,374,326

99.9%

249,952,870

99.9%

147,646,395

100.1%

25-34

156,424,882

99.6%

2,415,663,468

100.0%

2,774,706,986

100.0%

920,732,479

99.8%

35-44

323,624,196

100.6%

5,875,528,027

99.9%

8,554,094,758

100.0%

2,237,206,156

100.0%

45-54

340,733,447

98.7%

6,605,000,254

99.9%

10,642,985,328

100.1%

3,040,791,286

100.0%

55-64

443,897,049

100.4%

6,912,220,124

100.1%

8,972,404,806

99.9%

3,118,630,111

99.9%

65-74

172,472,214

99.7%

6,127,207,226

100.1%

8,704,756,697

99.9%

1,604,401,379

100.2%

75-84

5,070,236

132.1%

1,325,459,599

100.2%

13,867,379,988

100.0%

686,102,127

100.1%

85-99

0

0.0%

34,694,322

85.6%

1,279,011,875

100.8%

126,606,359

97.3%

gender x insurance_plan

insurance_plan   gender: F                gender: M
                        Outcome   Ratio          Outcome   Ratio
Other                76,199,638   98.3%      111,152,662  100.2%
Perm              4,259,680,761  100.1%    7,902,242,500  100.0%
Term              8,746,195,310  100.1%   30,277,420,743  100.0%
xL               20,175,116,226  100.0%   26,387,728,620  100.0%

gender x ltp

ltp              gender: F                gender: M
                        Outcome   Ratio          Outcome   Ratio
5 yr                233,647,378   99.9%      452,696,136  100.1%
10 yr             1,294,444,561  100.2%    5,977,220,162   99.9%
15 yr             1,152,920,343   99.7%    4,864,836,217  100.0%
20 yr             4,067,960,544  100.1%   13,699,088,063  100.0%
25 yr               164,969,732  100.1%      360,253,190  100.3%
30 yr               911,017,939  100.0%    2,181,713,846  100.1%
Not Level Term   24,668,222,437  100.0%   34,946,093,098  100.0%
Unknown             764,009,001  100.2%    2,196,643,813   99.9%

gender x iy_band1

iy_band1    gender: F                gender: M
                   Outcome   Ratio          Outcome   Ratio
1900-1989      415,833,187  100.0%    1,048,346,327  100.0%
1990-1999   10,124,056,124  100.0%   19,420,091,222  100.0%
2000-2009   19,505,796,137  100.0%   35,539,497,171  100.0%
2010+        3,211,506,487  100.0%    8,670,609,805  100.0%

insurance_plan x ltp

| ltp | Term Outcome | Term Ratio | Other Outcome | Other Ratio | Perm Outcome | Perm Ratio | xL Outcome | xL Ratio |
|---|---|---|---|---|---|---|---|---|
| 5 yr | 686,343,514 | 100.0% | | | | | | |
| 10 yr | 7,271,664,723 | 100.0% | | | | | | |
| 15 yr | 6,017,756,560 | 100.0% | | | | | | |
| 20 yr | 17,767,048,607 | 100.0% | | | | | | |
| 25 yr | 525,222,922 | 100.3% | | | | | | |
| 30 yr | 3,092,731,785 | 100.1% | | | | | | |
| Not Level Term | 702,195,128 | 99.0% | 187,352,300 | 99.4% | 12,161,923,261 | 100.0% | 46,562,844,846 | 100.0% |
| Unknown | 2,960,652,814 | 100.0% | | | | | | |

insurance_plan x iy_band1

| insurance_plan | 1900-1989 Outcome | 1900-1989 Ratio | 1990-1999 Outcome | 1990-1999 Ratio | 2000-2009 Outcome | 2000-2009 Ratio | 2010+ Outcome | 2010+ Ratio |
|---|---|---|---|---|---|---|---|---|
| Other | 699,086 | 11.5% | 24,035,150 | 129.3% | 95,671,569 | 99.6% | 66,946,495 | 98.9% |
| Perm | 685,297,227 | 100.6% | 8,414,471,562 | 100.0% | 2,230,633,550 | 100.0% | 831,520,922 | 99.8% |
| Term | 110,349,436 | 98.8% | 7,507,786,831 | 100.0% | 24,023,250,728 | 100.0% | 7,382,229,058 | 100.0% |
| xL | 667,833,765 | 100.3% | 13,597,853,803 | 100.0% | 28,695,737,461 | 100.0% | 3,601,419,817 | 100.0% |

ltp x iy_band1

| ltp | 1900-1989 Outcome | 1900-1989 Ratio | 1990-1999 Outcome | 1990-1999 Ratio | 2000-2009 Outcome | 2000-2009 Ratio | 2010+ Outcome | 2010+ Ratio |
|---|---|---|---|---|---|---|---|---|
| 5 yr | 11,828,904 | 91.8% | 356,634,986 | 100.4% | 256,413,452 | 100.3% | 61,466,172 | 98.7% |
| 10 yr | 7,313,258 | 93.4% | 395,337,236 | 100.0% | 4,556,557,084 | 100.0% | 2,312,457,145 | 100.0% |
| 15 yr | 770,000 | 52.6% | 786,184,143 | 99.9% | 4,222,555,938 | 100.0% | 1,008,246,479 | 100.0% |
| 20 yr | 27,216,551 | 110.5% | 3,469,245,630 | 100.0% | 11,456,767,739 | 100.0% | 2,813,818,687 | 99.9% |
| 25 yr | 0 | 0.0% | 162,532,838 | 99.5% | 278,353,329 | 101.1% | 84,336,755 | 99.1% |
| 30 yr | 0 | 0.0% | 553,191,034 | 100.5% | 1,883,712,240 | 99.9% | 655,828,511 | 100.4% |
| Not Level Term | 1,355,860,609 | 100.0% | 22,279,426,292 | 100.0% | 31,476,742,520 | 100.0% | 4,502,286,114 | 100.0% |
| Unknown | 61,190,192 | 98.3% | 1,541,595,187 | 100.0% | 914,191,006 | 100.1% | 443,676,429 | 100.1% |

Elastic Net GLM

Background

Elastic net regularization allows the modeler to combine both LASSO and ridge penalties into a single model.

As one may recall, ordinary least squares regression minimizes the sum of squared differences between the response variable and the predicted values. In symbols,

\[ \underset{\beta}{\arg\min} \sum_{i=1}^{n}(y_{i}-x_{i}^{\top}\beta)^{2} \]

where \(x_{i}^{\top}\) is the \(i\)-th row of the \(n \times k\) design matrix \(X\).

This is equivalent to maximum likelihood estimation, where one assumes that the response variable \(y\) is normally distributed with mean \(X\beta\) and covariance \(\sigma^{2}I_{n \times n}\). The maximum is taken with respect to \(\beta\), and the variance parameter \(\sigma^{2}\) is assumed to be fixed but unknown.

The LASSO and ridge regression methods each add a penalty term on the coefficients \(\beta\). The LASSO adds the sum of the absolute values of the coefficients, scaled by a tunable weight, \(\lambda\). This term pulls the fitted coefficients toward 0 and can set some of them exactly to 0.

\[ \underset{\beta}{\arg\min} \sum_{i=1}^{n}(y_{i}-x_{i}^{\top}\beta)^{2}+ \lambda\sum_{j=1}^{k}|\beta_{j}|\]

The ridge penalty adds the sum of the squares of the coefficients, scaled by a tunable weight, \(\alpha\). This term also shrinks the fitted coefficients toward 0, but in general it does not set them exactly to 0.

\[ \underset{\beta}{\arg\min} \sum_{i=1}^{n}(y_{i}-x_{i}^{\top}\beta)^{2}+ \alpha\sum_{j=1}^{k}\beta_{j}^{2}\]

What may be new to some readers is that in both cases, for particular values of \(\lambda\) or \(\alpha\), the minimizers of these expressions correspond to the Bayesian maximum a posteriori (MAP) estimators for specific prior distributions for \(\beta\). In the ridge case, the prior is the normal distribution with mean 0 and covariance \(\tau^{2} I_{k \times k}\) for some assumed \(\tau^{2}\).

For the LASSO, the prior is the double-exponential (Laplace) distribution with mean 0 and scale parameter \(\tau\), applied independently to each coefficient.
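The correspondence can be sketched in a few lines. The derivation below assumes a normal likelihood with known \(\sigma^{2}\), writes the residual sum of squares as \(\mathrm{RSS}(\beta)\), and assumes the Laplace prior is parameterized by its scale \(\tau\) (density proportional to \(e^{-|\beta_{j}|/\tau}\)); other parameterizations change the constants but not the argument.

\[ -\log p(\beta \mid y) = \frac{1}{2\sigma^{2}}\,\mathrm{RSS}(\beta) + \frac{1}{2\tau^{2}}\sum_{j}\beta_{j}^{2} + \text{const} \quad \text{(normal prior)} \]

\[ -\log p(\beta \mid y) = \frac{1}{2\sigma^{2}}\,\mathrm{RSS}(\beta) + \frac{1}{\tau}\sum_{j}|\beta_{j}| + \text{const} \quad \text{(Laplace prior)} \]

Multiplying each expression by \(2\sigma^{2}\) does not change its minimizer, and comparing with the penalized objectives gives a ridge weight of \(\sigma^{2}/\tau^{2}\) and a LASSO weight of \(2\sigma^{2}/\tau\) under these assumptions. A tighter prior (smaller \(\tau\)) therefore means a heavier penalty.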

In either case, it can be shown that if \(\sigma^{2}\) and \(\tau\) are known, the penalizing weights are uniquely determined and are equivalent to the \(k\) term in Bühlmann credibility. In practice, \(\sigma^{2}\) and \(\tau\) are unknown, so the penalizing weights must be tuned. The resulting optimal \(\beta\) is also credible from a Bayesian perspective. Moreover, it can be shown that these facts carry over to the GLM case.
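For reference, the two penalties are combined in the elastic net roughly as follows. This is the form documented for the glmnet package in the Gaussian case; the symbol \(a\) is introduced here for glmnet's `alpha` argument (the mixing weight) so that it is not confused with the ridge weight used above, and \(\beta_{0}\) denotes the unpenalized intercept.

\[ \underset{\beta_{0},\beta}{\arg\min}\; \frac{1}{2n}\sum_{i=1}^{n}(y_{i}-\beta_{0}-x_{i}^{\top}\beta)^{2} + \lambda\left[\frac{1-a}{2}\sum_{j}\beta_{j}^{2} + a\sum_{j}|\beta_{j}|\right] \]

Setting \(a = 1\) recovers the LASSO and \(a = 0\) recovers ridge regression, so a single overall weight \(\lambda\) remains to be tuned once \(a\) is chosen. For a GLM such as the Poisson model fitted below, the squared-error term is replaced by the negative log-likelihood divided by \(n\).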

Data Preparation

Elastic net GLMs as implemented in the glmnet package require that the inputs be converted to model matrices.
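As a rough illustration of that conversion, the sketch below builds a sparse model matrix with main effects and two-way interactions from a small made-up data frame. The data frame `toy`, its values, and the formula are hypothetical stand-ins, and `sparse.model.matrix()` from the Matrix package is used here in place of whatever helper the tutorial's own preparation code calls.

```r
library(Matrix)  # recommended package distributed with R

# Hypothetical toy data: two categorical predictors, death counts, expected deaths
toy <- data.frame(
  gender         = factor(c("F", "M", "F", "M", "F", "M")),
  insurance_plan = factor(c("Term", "Term", "Perm", "Perm", "xL", "xL")),
  deaths         = c(3, 5, 2, 4, 1, 6),
  expected       = c(2.5, 5.5, 2.0, 4.5, 1.5, 5.0)
)

# Main effects plus all two-way interactions
toy_formula <- ~ (gender + insurance_plan)^2

# Expand factors to indicator columns; drop the intercept column because
# glmnet fits its own intercept
x_sparse <- sparse.model.matrix(toy_formula, data = toy)[, -1]

# Response and log-offset vectors in the shape glmnet expects
y_toy      <- toy$deaths
offset_toy <- log(toy$expected)

dim(x_sparse)
colnames(x_sparse)
```

Each factor level other than the reference level becomes its own column, and each interaction becomes the product of the corresponding indicator columns, which is why sparse storage helps when there are many levels.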

Model Fitting

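Once the data are set up, we can calibrate the overall penalty weight, lambda, using k-fold cross-validation (cv.glmnet uses 10 folds by default). Note that the glmnet argument alpha is the mixing weight between the LASSO and ridge penalties and is held fixed at fGLMNetAlpha; it is not the ridge weight \(\alpha\) from the Background section.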

## In this section we fit a penalized Poisson GLM using glmnet. 
## The penalty weight lambda is tuned by cross-validation, with the mixing parameter alpha held fixed at fGLMNetAlpha
## Predictions then use the fit at the cross-validated optimum, lambda.min

#-----------------------------------------#
##### Perform Cross-Validation #####
#-----------------------------------------#
## Load or fit the cross-validated glmnet model

## Check if the cached model exists and is valid
if (bUseCache && file.exists(paste0(cacheFileRoot, "_glmnet_int_cv_model.rds")) && !bInvalidateCaches) {
  ## Load the cached model if it exists
  cvfit <- readRDS(paste0(cacheFileRoot, "_glmnet_int_cv_model.rds"))
} else {
  ## Initialize a cluster for parallel processing
  cl <- makeCluster(nGLMNetCores)
  registerDoParallel(cl)
  
  ## Set glmnet control options, enable iteration trace if debugging
  glmnet.control(itrace = ifelse(bDebug, 1, 0))
  
  ## Set random seed for reproducibility
  set.seed(nELSeed)
  
  ## Perform cross-validated glmnet model fitting
  cvfit <- cv.glmnet(
    train.x.net, 
    train.y.net, 
    offset = log(train.offset.net), 
    family = "poisson", 
    alpha = fGLMNetAlpha,
    parallel = TRUE
  )
  
  ## Stop the parallel cluster
  stopCluster(cl)
  
  ## Save the model to cache if caching is enabled
  if (bUseCache) {
    saveRDS(cvfit, paste0(cacheFileRoot, "_glmnet_int_cv_model.rds"))
  }
}

## Generate predictions for the test data (handling offsets correctly)
test[, predictions_glmnet := predict(
  cvfit,
  s = "lambda.min",
  newx = test.x.net, 
  type = "response", 
  newoffset = log(test.offset.net)
)]

## Generate predictions for the training data (handling offsets correctly)
train[, predictions_glmnet := predict(
  cvfit,
  s = "lambda.min",
  newx = train.x.net, 
  type = "response", 
  newoffset = log(train.offset.net)
)]

Model Plots and Tables

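An important by-product of the cross-validation is the plot of cross-validated deviance against \(\lambda\). The optimal choice of \(\lambda\) is the one with the lowest cross-validated deviance (lambda.min), not the smallest \(\lambda\) on the path, and the model associated with that \(\lambda\) is the final model.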
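The following sketch shows how the selected \(\lambda\) can be read off a cv.glmnet object. It runs on simulated data, not the study data; the sample size, coefficients, `alpha = 0.5`, and `nfolds = 5` are arbitrary choices made only to keep the example small.

```r
library(glmnet)

set.seed(1)
n <- 500
p <- 10
x <- matrix(rnorm(n * p), n, p)

# Hypothetical expected deaths (the offset) and simulated death counts
expected <- runif(n, 0.5, 2)
y <- rpois(n, expected * exp(0.3 * x[, 1] - 0.2 * x[, 2]))

cv_toy <- cv.glmnet(
  x, y,
  family = "poisson",
  offset = log(expected),
  alpha  = 0.5,
  nfolds = 5
)

# Position on the lambda path with the lowest cross-validated deviance
best <- which.min(cv_toy$cvm)

# lambda.min is the lambda at that position; lambda.1se is the largest lambda
# whose deviance is within one standard error of the minimum
c(lambda.min = cv_toy$lambda.min, lambda.1se = cv_toy$lambda.1se)

# Number of nonzero coefficients at lambda.min
cv_toy$nzero[best]
```

Choosing lambda.1se instead of lambda.min would give a sparser model at the cost of a slightly higher cross-validated deviance; the tutorial's predictions use lambda.min.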

One can also plot the trajectory of coefficients as progressively higher \(\lambda\) imposes ever harsher penalties on the coefficients.

Lambda Plot
## plot lambda vs deviance

data.table(lambda=cvfit$lambda,
           cvm=cvfit$cvm,
           cvlo=cvfit$cvlo,
           cvup=cvfit$cvup,
           nzero=cvfit$nzero) %>%
  ggplot(aes(x=log(lambda))) +
  geom_point(aes(y=cvm),color="red") +
  geom_errorbar(aes(ymin=cvlo,ymax=cvup),color="grey") +
  scale_y_continuous(name=cvfit$name,
                     limits = range(cvfit$cvup,cvfit$cvlo)) +
  scale_x_continuous(name=expression(Log(lambda))) +
  geom_vline(xintercept=log(cvfit$lambda.min),linetype=3) +
  geom_vline(xintercept = log(cvfit$lambda.1se),linetype=3) +
  theme_minimal()

Coefficient Penalization
plot(cvfit$glmnet.fit,xvar="lambda")

Final Model
#---------------------------------------------------------------#
##### Examine model for minimum lambda and make predictions #####
#---------------------------------------------------------------#

## get coefficients for fitted version, reformatted and filtered
reformatCoefs(cvfit, pred_cols)  %>%
  filter(Coef != 0) %>%
  select(Feature1Name,
         Feature1Level,
         Feature2Name,
         Feature2Level,
         Coef) %>%
  mutate(Coef=exp(Coef)) %>%
  flextable() %>%
  highlight(j="Coef",color="yellow",part="body", i = ~ abs(log(Coef)) > log(1.05)) %>%
  set_formatter(
    Coef=function(x) paste0(sprintf("%.01f", 100*x),"%")
  ) %>%
  set_table_properties(opts_html=list(
        scroll=list(
          add_css="max-height: 500px;"
          )
        )
        ) %>%
  autofit() ->
  output_table

if(output_format == "html") {
  output_table
} else {
  wb <- wb_workbook()
  
  wb$add_worksheet("GLMNet Coefficients")
  
  wb <- wb_add_flextable(
    wb,
    "GLMNet Coefficients",
    output_table
  )
  
  wb$save(
    paste0(
      exportsRoot,
      "_glmnet_coefficients_table.xlsx"
    )
  )
  
  cat("See included Excel table for additional information.\n")
}

| Feature1Name | Feature1Level | Feature2Name | Feature2Level | Coef |
|---|---|---|---|---|
| (Intercept) | (Intercept) | | | 127.5% |
| uw | N/2/2 | | | 103.5% |
| uw | N/4/1 | | | 87.4% |
| uw | N/4/2 | | | 91.0% |
| face_amount_band | 05 - 100,000 - 249,999 | | | 86.4% |
| face_amount_band | 06 - 250,000 - 499,999 | | | 86.9% |
| face_amount_band | 07 - 500,000 - 999,999 | | | 93.4% |
| face_amount_band | 08 - 1,000,000+ | | | 96.3% |
| gender | M | | | 106.9% |
| insurance_plan | xL | | | 100.1% |
| uw | N/2/2 | ia_band1 | 25-34 | 109.2% |
| uw | N/3/1 | ia_band1 | 25-34 | 101.0% |
| uw | N/3/2 | ia_band1 | 25-34 | 100.9% |
| uw | N/3/3 | ia_band1 | 25-34 | 122.3% |
| uw | N/4/3 | ia_band1 | 25-34 | 114.0% |
| uw | N/4/4 | ia_band1 | 25-34 | 108.8% |
| uw | S/1/1 | ia_band1 | 25-34 | 97.3% |
| uw | S/2/1 | ia_band1 | 25-34 | 95.3% |
| uw | N/2/1 | ia_band1 | 35-44 | 98.1% |
| uw | N/2/2 | ia_band1 | 35-44 | 105.0% |
| uw | N/3/1 | ia_band1 | 35-44 | 100.1% |
| uw | N/3/2 | ia_band1 | 35-44 | 101.9% |
| uw | N/3/3 | ia_band1 | 35-44 | 131.6% |
| uw | N/4/1 | ia_band1 | 35-44 | 100.8% |
| uw | N/4/2 | ia_band1 | 35-44 | 103.5% |
| uw | N/4/4 | ia_band1 | 35-44 | 112.2% |
| uw | S/2/1 | ia_band1 | 35-44 | 95.2% |
| uw | U/1/1 | ia_band1 | 35-44 | 107.5% |
| uw | N/2/1 | ia_band1 | 45-54 | 98.3% |
| uw | N/2/2 | ia_band1 | 45-54 | 109.5% |
| uw | N/3/2 | ia_band1 | 45-54 | 97.7% |
| uw | N/3/3 | ia_band1 | 45-54 | 119.0% |
| uw | N/4/2 | ia_band1 | 45-54 | 99.8% |
| uw | N/4/3 | ia_band1 | 45-54 | 109.5% |
| uw | N/4/4 | ia_band1 | 45-54 | 114.8% |
| uw | S/2/2 | ia_band1 | 45-54 | 105.3% |
| uw | U/1/1 | ia_band1 | 45-54 | 101.8% |
| uw | N/2/1 | ia_band1 | 55-64 | 91.6% |
| uw | N/3/1 | ia_band1 | 55-64 | 90.8% |
| uw | N/3/2 | ia_band1 | 55-64 | 86.5% |
| uw | N/3/3 | ia_band1 | 55-64 | 105.9% |
| uw | N/4/1 | ia_band1 | 55-64 | 82.6% |
| uw | N/4/2 | ia_band1 | 55-64 | 94.0% |
| uw | S/1/1 | ia_band1 | 55-64 | 109.4% |
| uw | S/2/1 | ia_band1 | 55-64 | 98.2% |
| uw | S/2/2 | ia_band1 | 55-64 | 92.1% |
| uw | U/1/1 | ia_band1 | 55-64 | 101.6% |
| uw | N/3/1 | ia_band1 | 65-74 | 88.6% |
| uw | N/3/2 | ia_band1 | 65-74 | 85.9% |
| uw | N/3/3 | ia_band1 | 65-74 | 97.1% |
| uw | N/4/1 | ia_band1 | 65-74 | 85.5% |
| uw | N/4/2 | ia_band1 | 65-74 | 80.8% |
| uw | N/4/3 | ia_band1 | 65-74 | 93.0% |
| uw | S/2/2 | ia_band1 | 65-74 | 103.0% |
| uw | U/1/1 | ia_band1 | 65-74 | 98.2% |
| uw | N/2/1 | ia_band1 | 75-84 | 113.4% |
| uw | N/2/2 | ia_band1 | 75-84 | 98.0% |
| uw | N/3/1 | ia_band1 | 75-84 | 92.9% |
| uw | N/3/2 | ia_band1 | 75-84 | 82.8% |
| uw | N/3/3 | ia_band1 | 75-84 | 83.1% |
| uw | N/4/1 | ia_band1 | 75-84 | 51.1% |
| uw | N/4/2 | ia_band1 | 75-84 | 137.2% |
| uw | N/4/3 | ia_band1 | 75-84 | 89.5% |
| uw | N/4/4 | ia_band1 | 75-84 | 90.6% |
| uw | S/1/1 | ia_band1 | 75-84 | 119.0% |
| uw | S/2/2 | ia_band1 | 75-84 | 96.8% |
| uw | N/2/1 | ia_band1 | 85-99 | 191.0% |
| uw | N/2/2 | ia_band1 | 85-99 | 101.0% |
| uw | N/3/2 | ia_band1 | 85-99 | 113.0% |
| uw | N/3/3 | ia_band1 | 85-99 | 83.4% |
| uw | N/4/2 | ia_band1 | 85-99 | 89.0% |
| uw | N/4/3 | ia_band1 | 85-99 | 83.3% |
| uw | N/4/4 | ia_band1 | 85-99 | 88.1% |
| uw | S/2/1 | ia_band1 | 85-99 | 141.8% |
| uw | S/2/2 | ia_band1 | 85-99 | 169.5% |
| uw | N/2/2 | face_amount_band | 04 - 50,000 - 99,999 | 100.8% |
| uw | N/3/3 | face_amount_band | 04 - 50,000 - 99,999 | 99.7% |
| uw | N/2/2 | face_amount_band | 05 - 100,000 - 249,999 | 101.0% |
| uw | N/4/1 | face_amount_band | 05 - 100,000 - 249,999 | 95.3% |
| uw | N/4/2 | face_amount_band | 05 - 100,000 - 249,999 | 96.2% |
| uw | S/2/1 | face_amount_band | 05 - 100,000 - 249,999 | 99.5% |
| uw | N/4/1 | face_amount_band | 06 - 250,000 - 499,999 | 98.9% |
| uw | N/4/3 | face_amount_band | 06 - 250,000 - 499,999 | 95.0% |
| uw | N/2/1 | face_amount_band | 07 - 500,000 - 999,999 | 99.3% |
| uw | N/2/2 | face_amount_band | 07 - 500,000 - 999,999 | 99.2% |
| uw | N/3/2 | face_amount_band | 07 - 500,000 - 999,999 | 99.8% |
| uw | N/3/3 | face_amount_band | 07 - 500,000 - 999,999 | 102.9% |
| uw | N/4/1 | face_amount_band | 07 - 500,000 - 999,999 | 98.4% |
| uw | N/4/3 | face_amount_band | 07 - 500,000 - 999,999 | 105.1% |
| uw | N/4/4 | face_amount_band | 07 - 500,000 - 999,999 | 106.6% |
| uw | S/1/1 | face_amount_band | 07 - 500,000 - 999,999 | 95.5% |
| uw | S/2/1 | face_amount_band | 07 - 500,000 - 999,999 | 100.9% |
| uw | S/2/2 | face_amount_band | 07 - 500,000 - 999,999 | 116.0% |
| uw | N/2/1 | face_amount_band | 08 - 1,000,000+ | 91.6% |
| uw | N/2/2 | face_amount_band | 08 - 1,000,000+ | 104.4% |
| uw | N/3/1 | face_amount_band | 08 - 1,000,000+ | 99.1% |
| uw | N/3/2 | face_amount_band | 08 - 1,000,000+ | 103.8% |
| uw | N/3/3 | face_amount_band | 08 - 1,000,000+ | 114.7% |
| uw | N/4/1 | face_amount_band | 08 - 1,000,000+ | 103.4% |
| uw | N/4/2 | face_amount_band | 08 - 1,000,000+ | 98.0% |
| uw | N/4/3 | face_amount_band | 08 - 1,000,000+ | 103.9% |
| uw | N/4/4 | face_amount_band | 08 - 1,000,000+ | 119.8% |
| uw | S/2/1 | face_amount_band | 08 - 1,000,000+ | 97.7% |
| uw | U/1/1 | face_amount_band | 08 - 1,000,000+ | 110.8% |
| face_amount_band | 04 - 50,000 - 99,999 | ia_band1 | 25-34 | 102.5% |
| face_amount_band | 05 - 100,000 - 249,999 | ia_band1 | 25-34 | 92.2% |
| face_amount_band | 06 - 250,000 - 499,999 | ia_band1 | 25-34 | 87.3% |
| face_amount_band | 07 - 500,000 - 999,999 | ia_band1 | 25-34 | 91.1% |
| face_amount_band | 08 - 1,000,000+ | ia_band1 | 25-34 | 87.3% |
| face_amount_band | 04 - 50,000 - 99,999 | ia_band1 | 35-44 | 107.4% |
| face_amount_band | 05 - 100,000 - 249,999 | ia_band1 | 35-44 | 97.5% |
| face_amount_band | 06 - 250,000 - 499,999 | ia_band1 | 35-44 | 93.3% |
| face_amount_band | 07 - 500,000 - 999,999 | ia_band1 | 35-44 | 92.0% |
| face_amount_band | 08 - 1,000,000+ | ia_band1 | 35-44 | 93.9% |
| face_amount_band | 05 - 100,000 - 249,999 | ia_band1 | 45-54 | 93.1% |
| face_amount_band | 06 - 250,000 - 499,999 | ia_band1 | 45-54 | 88.5% |
| face_amount_band | 07 - 500,000 - 999,999 | ia_band1 | 45-54 | 89.0% |
| face_amount_band | 08 - 1,000,000+ | ia_band1 | 45-54 | 90.1% |
| face_amount_band | 04 - 50,000 - 99,999 | ia_band1 | 55-64 | 98.1% |
| face_amount_band | 05 - 100,000 - 249,999 | ia_band1 | 65-74 | 106.4% |
| face_amount_band | 06 - 250,000 - 499,999 | ia_band1 | 65-74 | 109.3% |
| face_amount_band | 07 - 500,000 - 999,999 | ia_band1 | 65-74 | 112.6% |
| face_amount_band | 08 - 1,000,000+ | ia_band1 | 65-74 | 102.4% |
| face_amount_band | 04 - 50,000 - 99,999 | ia_band1 | 75-84 | 97.7% |
| face_amount_band | 05 - 100,000 - 249,999 | ia_band1 | 75-84 | 103.7% |
| face_amount_band | 06 - 250,000 - 499,999 | ia_band1 | 75-84 | 108.3% |
| face_amount_band | 07 - 500,000 - 999,999 | ia_band1 | 75-84 | 112.0% |
| face_amount_band | 07 - 500,000 - 999,999 | ia_band1 | 85-99 | 102.8% |
| face_amount_band | 08 - 1,000,000+ | ia_band1 | 85-99 | 85.5% |
| face_amount_band | 08 - 1,000,000+ | dur_band1 | 02 | 97.8% |
| face_amount_band | 08 - 1,000,000+ | dur_band1 | 03 | 99.0% |
| face_amount_band | 05 - 100,000 - 249,999 | dur_band1 | 04-05 | 98.4% |
| face_amount_band | 06 - 250,000 - 499,999 | dur_band1 | 04-05 | 98.5% |
| face_amount_band | 05 - 100,000 - 249,999 | dur_band1 | 06-15 | 96.8% |
| face_amount_band | 07 - 500,000 - 999,999 | dur_band1 | 06-15 | 97.2% |
| face_amount_band | 08 - 1,000,000+ | dur_band1 | 06-15 | 93.4% |
| face_amount_band | 04 - 50,000 - 99,999 | dur_band1 | 16-25 | 96.3% |
| face_amount_band | 05 - 100,000 - 249,999 | dur_band1 | 16-25 | 97.8% |
| face_amount_band | 07 - 500,000 - 999,999 | dur_band1 | 16-25 | 101.3% |
| face_amount_band | 08 - 1,000,000+ | dur_band1 | 16-25 | 106.4% |
| uw | N/2/1 | dur_band1 | 02 | 143.5% |
| uw | N/3/2 | dur_band1 | 02 | 102.1% |
| uw | N/4/1 | dur_band1 | 02 | 81.4% |
| uw | N/4/3 | dur_band1 | 02 | 113.4% |
| uw | N/2/2 | dur_band1 | 03 | 117.3% |
| uw | N/3/1 | dur_band1 | 03 | 107.0% |
| uw | N/4/1 | dur_band1 | 03 | 98.9% |
| uw | N/4/2 | dur_band1 | 03 | 84.4% |
| uw | N/4/4 | dur_band1 | 03 | 102.6% |
| uw | N/2/1 | dur_band1 | 04-05 | 81.0% |
| uw | N/3/1 | dur_band1 | 04-05 | 96.9% |
| uw | N/3/3 | dur_band1 | 04-05 | 106.8% |
| uw | N/4/3 | dur_band1 | 04-05 | 99.0% |
| uw | S/2/1 | dur_band1 | 04-05 | 97.7% |
| uw | N/2/1 | dur_band1 | 06-15 | 96.5% |
| uw | N/2/2 | dur_band1 | 06-15 | 100.6% |
| uw | N/3/1 | dur_band1 | 06-15 | 105.4% |
| uw | N/3/2 | dur_band1 | 06-15 | 103.3% |
| uw | N/4/3 | dur_band1 | 06-15 | 107.4% |
| uw | N/4/4 | dur_band1 | 06-15 | 101.4% |
| uw | S/2/1 | dur_band1 | 06-15 | 88.8% |
| uw | U/1/1 | dur_band1 | 06-15 | 102.5% |
| uw | N/2/1 | dur_band1 | 16-25 | 100.5% |
| uw | N/2/2 | dur_band1 | 16-25 | 101.7% |
| uw | N/3/1 | dur_band1 | 16-25 | 97.1% |
| uw | N/3/2 | dur_band1 | 16-25 | 90.9% |
| uw | N/3/3 | dur_band1 | 16-25 | 88.1% |
| uw | N/4/2 | dur_band1 | 16-25 | 97.2% |
| uw | N/4/4 | dur_band1 | 16-25 | 81.6% |
| uw | S/1/1 | dur_band1 | 16-25 | 108.2% |
| uw | S/2/2 | dur_band1 | 16-25 | 103.8% |
| uw | U/1/1 | dur_band1 | 16-25 | 101.3% |
| dur_band1 | 04-05 | ia_band1 | 25-34 | 100.1% |
| dur_band1 | 16-25 | ia_band1 | 25-34 | 99.3% |
| dur_band1 | 02 | ia_band1 | 35-44 | 100.3% |
| dur_band1 | 03 | ia_band1 | 35-44 | 101.9% |
| dur_band1 | 04-05 | ia_band1 | 35-44 | 91.9% |
| dur_band1 | 06-15 | ia_band1 | 35-44 | 93.6% |
| dur_band1 | 16-25 | ia_band1 | 35-44 | 97.3% |
| dur_band1 | 02 | ia_band1 | 45-54 | 97.8% |
| dur_band1 | 03 | ia_band1 | 45-54 | 101.7% |
| dur_band1 | 16-25 | ia_band1 | 45-54 | 102.9% |
| dur_band1 | 02 | ia_band1 | 55-64 | 109.6% |
| dur_band1 | 03 | ia_band1 | 55-64 | 92.9% |
| dur_band1 | 04-05 | ia_band1 | 55-64 | 97.1% |
| dur_band1 | 06-15 | ia_band1 | 55-64 | 100.0% |
| dur_band1 | 16-25 | ia_band1 | 55-64 | 112.3% |
| dur_band1 | 04-05 | ia_band1 | 65-74 | 101.5% |
| dur_band1 | 06-15 | ia_band1 | 65-74 | 96.6% |
| dur_band1 | 16-25 | ia_band1 | 65-74 | 103.2% |
| dur_band1 | 02 | ia_band1 | 75-84 | 87.6% |
| dur_band1 | 04-05 | ia_band1 | 75-84 | 113.2% |
| dur_band1 | 16-25 | ia_band1 | 75-84 | 99.4% |
| dur_band1 | 04-05 | ia_band1 | 85-99 | 82.9% |
| dur_band1 | 06-15 | ia_band1 | 85-99 | 99.0% |
| dur_band1 | 16-25 | ia_band1 | 85-99 | 56.1% |
| face_amount_band | 04 - 50,000 - 99,999 | gender | M | 103.3% |
| face_amount_band | 06 - 250,000 - 499,999 | gender | M | 93.3% |
| face_amount_band | 07 - 500,000 - 999,999 | gender | M | 90.8% |
| face_amount_band | 08 - 1,000,000+ | gender | M | 89.6% |
| dur_band1 | 04-05 | ltp | 10 yr | 98.6% |
| dur_band1 | 06-15 | ltp | 10 yr | 103.9% |
| dur_band1 | 16-25 | ltp | 10 yr | 123.1% |
| dur_band1 | 03 | ltp | 15 yr | 100.6% |
| dur_band1 | 04-05 | ltp | 15 yr | 95.7% |
| dur_band1 | 06-15 | ltp | 15 yr | 95.9% |
| dur_band1 | 16-25 | ltp | 15 yr | 222.7% |
| dur_band1 | 06-15 | ltp | 20 yr | 90.9% |
| dur_band1 | 16-25 | ltp | 20 yr | 82.7% |
| dur_band1 | 04-05 | ltp | 25 yr | 100.3% |
| dur_band1 | 06-15 | ltp | 25 yr | 94.1% |
| dur_band1 | 16-25 | ltp | 25 yr | 97.0% |
| dur_band1 | 04-05 | ltp | 30 yr | 112.2% |
| dur_band1 | 16-25 | ltp | 30 yr | 92.7% |
| dur_band1 | 04-05 | ltp | Not Level Term | 98.6% |
| face_amount_band | 05 - 100,000 - 249,999 | ltp | 10 yr | 99.1% |
| face_amount_band | 06 - 250,000 - 499,999 | ltp | 10 yr | 100.0% |
| face_amount_band | 07 - 500,000 - 999,999 | ltp | 10 yr | 103.0% |
| face_amount_band | 08 - 1,000,000+ | ltp | 10 yr | 98.0% |
| face_amount_band | 04 - 50,000 - 99,999 | ltp | 15 yr | 97.3% |
| face_amount_band | 05 - 100,000 - 249,999 | ltp | 15 yr | 92.0% |
| face_amount_band | 08 - 1,000,000+ | ltp | 15 yr | 102.2% |
| face_amount_band | 05 - 100,000 - 249,999 | ltp | 20 yr | 95.6% |
| face_amount_band | 06 - 250,000 - 499,999 | ltp | 20 yr | 98.7% |
| face_amount_band | 07 - 500,000 - 999,999 | ltp | 20 yr | 97.5% |
| face_amount_band | 08 - 1,000,000+ | ltp | 20 yr | 97.5% |
| face_amount_band | 07 - 500,000 - 999,999 | ltp | 25 yr | 96.6% |
| face_amount_band | 05 - 100,000 - 249,999 | ltp | 30 yr | 101.9% |
| face_amount_band | 06 - 250,000 - 499,999 | ltp | 30 yr | 99.6% |
| face_amount_band | 07 - 500,000 - 999,999 | ltp | 30 yr | 96.5% |
| face_amount_band | 08 - 1,000,000+ | ltp | 30 yr | 96.6% |
| face_amount_band | 04 - 50,000 - 99,999 | ltp | Not Level Term | 90.2% |
| face_amount_band | 05 - 100,000 - 249,999 | ltp | Not Level Term | 98.6% |
| face_amount_band | 07 - 500,000 - 999,999 | ltp | Not Level Term | 98.3% |
| face_amount_band | 04 - 50,000 - 99,999 | ltp | Unknown | 101.0% |
| face_amount_band | 06 - 250,000 - 499,999 | ltp | Unknown | 95.7% |
| face_amount_band | 07 - 500,000 - 999,999 | ltp | Unknown | 90.9% |
| uw | N/2/2 | ltp | 10 yr | 109.2% |
| uw | N/3/3 | ltp | 10 yr | 103.5% |
| uw | N/4/1 | ltp | 10 yr | 108.2% |
| uw | N/4/3 | ltp | 10 yr | 83.4% |
| uw | S/1/1 | ltp | 10 yr | 108.0% |
| uw | S/2/2 | ltp | 10 yr | 100.4% |
| uw | N/2/1 | ltp | 15 yr | 98.1% |
| uw | N/3/1 | ltp | 15 yr | 95.6% |
| uw | N/3/2 | ltp | 15 yr | 89.6% |
| uw | N/4/1 | ltp | 15 yr | 98.3% |
| uw | N/4/4 | ltp | 15 yr | 104.5% |
| uw | N/2/2 | ltp | 20 yr | 107.1% |
| uw | N/3/1 | ltp | 20 yr | 91.9% |
| uw | N/3/2 | ltp | 20 yr | 97.5% |
| uw | N/3/3 | ltp | 20 yr | 106.0% |
| uw | N/4/2 | ltp | 20 yr | 100.8% |
| uw | N/4/4 | ltp | 20 yr | 108.5% |
| uw | S/1/1 | ltp | 20 yr | 112.8% |
| uw | S/2/2 | ltp | 20 yr | 105.5% |
| uw | N/2/2 | ltp | 25 yr | 102.2% |
| uw | N/3/1 | ltp | 25 yr | 91.9% |
| uw | N/3/2 | ltp | 25 yr | 79.8% |
| uw | N/2/2 | ltp | 30 yr | 99.8% |
| uw | N/3/1 | ltp | 30 yr | 91.4% |
| uw | N/3/2 | ltp | 30 yr | 83.9% |
| uw | N/3/3 | ltp | 30 yr | 95.6% |
| uw | N/4/1 | ltp | 30 yr | 95.5% |
| uw | N/4/3 | ltp | 30 yr | 104.9% |
| uw | N/4/4 | ltp | 30 yr | 103.0% |
| uw | N/2/1 | ltp | Not Level Term | 87.2% |
| uw | N/4/1 | ltp | Not Level Term | 99.9% |
| uw | N/4/2 | ltp | Not Level Term | 89.8% |
| uw | N/4/3 | ltp | Not Level Term | 81.3% |
| uw | S/2/1 | ltp | Not Level Term | 90.3% |
| uw | U/1/1 | ltp | Not Level Term | 112.4% |
| uw | N/2/1 | ltp | Unknown | 103.9% |
| uw | N/2/2 | ltp | Unknown | 116.5% |
| uw | N/3/1 | ltp | Unknown | 100.1% |
| uw | N/3/2 | ltp | Unknown | 113.9% |
| uw | N/3/3 | ltp | Unknown | 111.1% |
| uw | N/4/4 | ltp | Unknown | 86.8% |
| uw | S/1/1 | ltp | Unknown | 102.7% |
| face_amount_band | 08 - 1,000,000+ | iy_band1 | 1990-1999 | 93.6% |
| face_amount_band | 05 - 100,000 - 249,999 | iy_band1 | 2000-2009 | 98.5% |
| face_amount_band | 06 - 250,000 - 499,999 | iy_band1 | 2000-2009 | 93.7% |
| face_amount_band | 07 - 500,000 - 999,999 | iy_band1 | 2000-2009 | 89.1% |
| face_amount_band | 08 - 1,000,000+ | iy_band1 | 2000-2009 | 83.7% |
| face_amount_band | 04 - 50,000 - 99,999 | iy_band1 | 2010+ | 110.1% |
| face_amount_band | 06 - 250,000 - 499,999 | iy_band1 | 2010+ | 90.1% |
| face_amount_band | 07 - 500,000 - 999,999 | iy_band1 | 2010+ | 80.6% |
| face_amount_band | 08 - 1,000,000+ | iy_band1 | 2010+ | 71.6% |
| ia_band1 | 25-34 | ltp | 10 yr | 100.2% |
| ia_band1 | 35-44 | ltp | 10 yr | 100.9% |
| ia_band1 | 55-64 | ltp | 10 yr | 93.1% |
| ia_band1 | 25-34 | ltp | 15 yr | 99.4% |
| ia_band1 | 35-44 | ltp | 15 yr | 104.0% |
| ia_band1 | 45-54 | ltp | 15 yr | 93.5% |
| ia_band1 | 65-74 | ltp | 15 yr | 117.0% |
| ia_band1 | 35-44 | ltp | 20 yr | 99.2% |
| ia_band1 | 45-54 | ltp | 20 yr | 95.3% |
| ia_band1 | 25-34 | ltp | 25 yr | 98.4% |
| ia_band1 | 35-44 | ltp | 25 yr | 97.5% |
| ia_band1 | 25-34 | ltp | 30 yr | 96.8% |
| ia_band1 | 25-34 | ltp | Not Level Term | 100.1% |
| ia_band1 | 35-44 | ltp | Not Level Term | 104.2% |
| ia_band1 | 55-64 | ltp | Not Level Term | 96.1% |
| ia_band1 | 25-34 | ltp | Unknown | 89.8% |
| ia_band1 | 35-44 | ltp | Unknown | 94.1% |
| ia_band1 | 75-84 | ltp | Unknown | 158.4% |
| uw | N/2/2 | insurance_plan | Perm | 108.7% |
| uw | N/3/2 | insurance_plan | Perm | 102.1% |
| uw | N/3/3 | insurance_plan | Perm | 119.5% |
| uw | S/1/1 | insurance_plan | Perm | 95.4% |
| uw | U/1/1 | insurance_plan | Perm | 116.0% |
| uw | N/2/2 | insurance_plan | Term | 120.3% |
| uw | N/3/1 | insurance_plan | Term | 85.1% |
| uw | N/4/4 | insurance_plan | Term | 110.6% |
| uw | S/1/1 | insurance_plan | Term | 108.2% |
| uw | S/2/2 | insurance_plan | Term | 101.4% |
| uw | N/2/1 | insurance_plan | xL | 99.0% |
| uw | N/3/1 | insurance_plan | xL | 73.0% |
| uw | N/3/2 | insurance_plan | xL | 79.2% |
| uw | N/3/3 | insurance_plan | xL | 100.8% |
| uw | N/4/1 | insurance_plan | xL | 85.2% |
| uw | N/4/2 | insurance_plan | xL | 90.9% |
| uw | N/4/4 | insurance_plan | xL | 101.1% |
| uw | S/2/1 | insurance_plan | xL | 95.2% |
| uw | S/2/2 | insurance_plan | xL | 95.2% |
| ia_band1 | 25-34 | gender | M | 107.7% |
| ia_band1 | 35-44 | gender | M | 106.4% |
| ia_band1 | 45-54 | gender | M | 104.0% |
| ia_band1 | 65-74 | gender | M | 99.0% |
| ia_band1 | 75-84 | gender | M | 96.3% |
| ia_band1 | 85-99 | gender | M | 97.0% |
| ia_band1 | 35-44 | iy_band1 | 1990-1999 | 96.0% |
| ia_band1 | 65-74 | iy_band1 | 1990-1999 | 102.2% |
| ia_band1 | 75-84 | iy_band1 | 1990-1999 | 99.5% |
| ia_band1 | 25-34 | iy_band1 | 2000-2009 | 96.7% |
| ia_band1 | 55-64 | iy_band1 | 2000-2009 | 96.8% |
| ia_band1 | 75-84 | iy_band1 | 2000-2009 | 92.7% |
| ia_band1 | 85-99 | iy_band1 | 2000-2009 | 97.1% |
| ia_band1 | 25-34 | iy_band1 | 2010+ | 104.9% |
| ia_band1 | 35-44 | iy_band1 | 2010+ | 107.2% |
| ia_band1 | 65-74 | iy_band1 | 2010+ | 87.4% |
| uw | N/2/1 | gender | M | 104.1% |
| uw | N/2/2 | gender | M | 100.3% |
| uw | N/3/2 | gender | M | 103.2% |
| uw | N/3/3 | gender | M | 98.3% |
| uw | N/4/2 | gender | M | 111.9% |
| uw | N/4/3 | gender | M | 109.3% |
| uw | N/4/4 | gender | M | 101.2% |
| uw | S/1/1 | gender | M | 98.0% |
| uw | S/2/2 | gender | M | 98.5% |
| face_amount_band | 04 - 50,000 - 99,999 | insurance_plan | Perm | 95.5% |
| face_amount_band | 05 - 100,000 - 249,999 | insurance_plan | Perm | 93.3% |
| face_amount_band | 06 - 250,000 - 499,999 | insurance_plan | Perm | 88.1% |
| face_amount_band | 07 - 500,000 - 999,999 | insurance_plan | Perm | 80.5% |
| face_amount_band | 04 - 50,000 - 99,999 | insurance_plan | Term | 102.0% |
| face_amount_band | 06 - 250,000 - 499,999 | insurance_plan | Term | 96.0% |
| face_amount_band | 07 - 500,000 - 999,999 | insurance_plan | Term | 91.7% |
| face_amount_band | 08 - 1,000,000+ | insurance_plan | Term | 94.5% |
| ia_band1 | 25-34 | insurance_plan | Perm | 106.3% |
| ia_band1 | 45-54 | insurance_plan | Perm | 98.9% |
| ia_band1 | 55-64 | insurance_plan | Perm | 94.0% |
| ia_band1 | 75-84 | insurance_plan | Perm | 100.7% |
| ia_band1 | 85-99 | insurance_plan | Perm | 125.4% |
| ia_band1 | 25-34 | insurance_plan | Term | 99.6% |
| ia_band1 | 25-34 | insurance_plan | xL | 106.2% |
| ia_band1 | 35-44 | insurance_plan | xL | 102.0% |
| ia_band1 | 45-54 | insurance_plan | xL | 101.4% |
| ia_band1 | 65-74 | insurance_plan | xL | 94.4% |
| ia_band1 | 75-84 | insurance_plan | xL | 97.9% |
| ia_band1 | 85-99 | insurance_plan | xL | 90.4% |
| uw | N/2/2 | iy_band1 | 1990-1999 | 107.1% |
| uw | N/3/2 | iy_band1 | 1990-1999 | 112.0% |
| uw | N/4/1 | iy_band1 | 1990-1999 | 94.2% |
| uw | N/4/2 | iy_band1 | 1990-1999 | 94.5% |
| uw | S/2/1 | iy_band1 | 1990-1999 | 112.5% |
| uw | S/2/2 | iy_band1 | 1990-1999 | 118.0% |
| uw | N/2/1 | iy_band1 | 2000-2009 | 91.7% |
| uw | N/2/2 | iy_band1 | 2000-2009 | 103.2% |
| uw | N/3/1 | iy_band1 | 2000-2009 | 97.8% |
| uw | N/3/3 | iy_band1 | 2000-2009 | 105.2% |
| uw | N/4/1 | iy_band1 | 2000-2009 | 98.8% |
| uw | S/1/1 | iy_band1 | 2000-2009 | 101.6% |
| uw | S/2/2 | iy_band1 | 2000-2009 | 109.9% |
| uw | N/2/1 | iy_band1 | 2010+ | 87.9% |
| uw | N/3/3 | iy_band1 | 2010+ | 100.9% |
| uw | S/2/1 | iy_band1 | 2010+ | 95.8% |
| dur_band1 | 02 | gender | M | 102.6% |
| dur_band1 | 04-05 | gender | M | 103.8% |
| insurance_plan | Term | ltp | Not Level Term | 75.8% |
| insurance_plan | xL | ltp | Not Level Term | 100.4% |
| dur_band1 | 04-05 | iy_band1 | 2000-2009 | 101.4% |
| dur_band1 | 06-15 | iy_band1 | 2000-2009 | 99.7% |
| dur_band1 | 16-25 | iy_band1 | 2000-2009 | 95.5% |
| dur_band1 | 04-05 | iy_band1 | 2010+ | 97.8% |
| dur_band1 | 02 | insurance_plan | Perm | 129.7% |
| dur_band1 | 16-25 | insurance_plan | Perm | 87.3% |
| dur_band1 | 02 | insurance_plan | xL | 96.5% |
| dur_band1 | 03 | insurance_plan | xL | 109.2% |
| dur_band1 | 04-05 | insurance_plan | xL | 97.9% |
| dur_band1 | 06-15 | insurance_plan | xL | 104.0% |
| gender | M | insurance_plan | Perm | 102.2% |
| ltp | 20 yr | iy_band1 | 1990-1999 | 98.9% |
| ltp | 25 yr | iy_band1 | 1990-1999 | 83.5% |
| ltp | 30 yr | iy_band1 | 1990-1999 | 90.9% |
| ltp | Not Level Term | iy_band1 | 1990-1999 | 96.7% |
| ltp | Unknown | iy_band1 | 1990-1999 | 104.5% |
| ltp | 15 yr | iy_band1 | 2000-2009 | 96.3% |
| ltp | Unknown | iy_band1 | 2000-2009 | 95.5% |
| insurance_plan | xL | iy_band1 | 1990-1999 | 96.8% |
| insurance_plan | xL | iy_band1 | 2000-2009 | 98.9% |
| insurance_plan | xL | iy_band1 | 2010+ | 103.7% |
| gender | M | iy_band1 | 1990-1999 | 96.3% |
| gender | M | iy_band1 | 2010+ | 103.5% |
| gender | M | ltp | 15 yr | 101.8% |
| gender | M | ltp | 20 yr | 101.0% |
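Because the model uses a log link, the entries in the coefficient table above are multiplicative: coefficients add on the log scale, so the tabulated factors multiply. The sketch below multiplies three entries copied from the table purely to illustrate the arithmetic. It is not the complete factor for any cell, since a full prediction also includes the intercept and every other applicable main effect and interaction; the variable names are made up for the example.

```r
# Multiplicative factors copied from three rows of the coefficient table
factors <- c(
  uw_N22_main        = 1.035,  # uw N/2/2 main effect
  uw_N22_x_ia_25_34  = 1.092,  # uw N/2/2 x ia_band1 25-34
  uw_N22_x_plan_Term = 1.203   # uw N/2/2 x insurance_plan Term
)

# Coefficients add on the log scale, so the factors multiply
log_coefs <- log(factors)
combined  <- exp(sum(log_coefs))
round(combined, 4)

# Same rule as the table's highlight: more than 5% away from 100% on the log
# scale, i.e. above 105% or below roughly 95.2%
highlighted <- abs(log(factors)) > log(1.05)
highlighted
```

Under these three entries alone, the combined relative factor is about 136%, and only the two interaction entries would meet the highlighting threshold.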

Lift
#-----------------------------------------#
##### Calculate Validation Metrics #####
#-----------------------------------------#

## generate plot
test[,decile.table(get(resp_var), predictions_glmnet/get(resp_offset), get(resp_offset))] %>%
  pivot_longer(-c(decile,exposures)) %>%
  as.data.table() %>%
  ggplot(aes(x=decile, y=value, col=name)) +  
  geom_line() +
  scale_x_continuous(breaks=c(1:10)) +
  labs(x="Decile",y="Deaths") +
  theme_minimal() +
  ggtitle("Decile Lift Plot") 
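The helper decile.table() used above is defined elsewhere in the tutorial's code and is not shown here. As a rough idea of what such a table involves, the sketch below defines a hypothetical `decile_lift()` in base R that sorts records by predicted ratio, splits them into ten groups of roughly equal weight, and totals actual and predicted deaths per group. The function, its argument names, and the simulated data are assumptions for illustration, not the authors' implementation.

```r
# Hypothetical stand-in for a decile lift table (not the tutorial's decile.table)
decile_lift <- function(actual, predicted_ratio, weight) {
  ord       <- order(predicted_ratio)
  actual    <- actual[ord]
  weight    <- weight[ord]
  predicted <- predicted_ratio[ord] * weight

  # Assign deciles so that each holds roughly one tenth of the total weight
  decile <- pmin(ceiling(10 * cumsum(weight) / sum(weight)), 10)
  decile <- factor(decile, levels = 1:10)

  data.frame(
    decile    = 1:10,
    exposures = as.vector(tapply(weight, decile, sum)),
    actual    = as.vector(tapply(actual, decile, sum)),
    predicted = as.vector(tapply(predicted, decile, sum))
  )
}

# Simulated example: expected deaths as weights, a predicted A/E ratio, and
# death counts drawn from a Poisson distribution
set.seed(42)
n        <- 2000
expected <- runif(n, 0.5, 2)
ratio    <- exp(rnorm(n, 0, 0.3))
deaths   <- rpois(n, expected * ratio)

lift <- decile_lift(deaths, ratio, expected)
lift
```

A well-calibrated model shows actual and predicted totals tracking each other across deciles, with both rising from the first decile to the tenth.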

Lorenz Curve
## lorenz plot
test[,lorenz(get(resp_var), predictions_glmnet / get(resp_offset), get(resp_offset))]
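The lorenz() helper is likewise defined outside this section. The sketch below shows one common way to build a Lorenz curve and a Gini index for a mortality model in base R: order records by predicted ratio, then accumulate the share of weight against the share of actual deaths. The function names and simulated data are hypothetical, and the tutorial's helper may differ in its details.

```r
# Hypothetical Lorenz curve points: cumulative weight share vs cumulative death share
lorenz_points <- function(actual, predicted_ratio, weight) {
  ord <- order(predicted_ratio)
  data.frame(
    cum_weight = c(0, cumsum(weight[ord]) / sum(weight)),
    cum_actual = c(0, cumsum(actual[ord]) / sum(actual))
  )
}

# Gini index as 1 minus twice the trapezoidal area under the Lorenz curve
gini_from_lorenz <- function(lz) {
  widths  <- diff(lz$cum_weight)
  heights <- (head(lz$cum_actual, -1) + tail(lz$cum_actual, -1)) / 2
  1 - 2 * sum(widths * heights)
}

# Simulated example with the same structure as a mortality study extract
set.seed(42)
n        <- 2000
expected <- runif(n, 0.5, 2)
ratio    <- exp(rnorm(n, 0, 0.3))
deaths   <- rpois(n, expected * ratio)

lz   <- lorenz_points(deaths, ratio, expected)
gini <- gini_from_lorenz(lz)
gini
```

The further the curve bows below the diagonal, the better the model separates low-mortality from high-mortality records, and the larger the Gini index.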

Plots of Terms

## Generate a grid of predictor levels for the training data
train[, ..pred_cols] %>%
  lapply(levels) %>%
  expand.grid() %>%
  setDT() -> train.grid

## Create model matrix for the training grid
train.grid %>%
  model.Matrix(
    object = glmnetFormula,
    data = .,
    sparse = bUseSparse
  ) %>%
  ## Predict on the link (log) scale with a zero offset, giving the log factor for each grid cell
  predict(
    cvfit,
    newx = .,
    s = "lambda.min",
    newoffset = rep(0, nrow(.))
  ) %>%
  as.vector() -> newCoef

## Add the predicted factors to the training grid
train.grid %>%
  add_column(Factor = exp(newCoef)) %>%
  setDT() -> train.grid

## Reformat coefficients and generate the list of interactions
reformatCoefs(cvfit, pred_cols) %>%
  filter(Coef != 0 & !is.na(Feature2Name)) %>%
  select(Feature1Name, Feature2Name) %>%
  distinct() %>%
  as.list() %>%
  purrr::list_transpose() -> glmnet.int.list
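The grid approach above can be illustrated on a small scale. The sketch below deliberately swaps in an unpenalized base-R `glm()` instead of glmnet so that it runs without extra packages; the data, variable names, and effect sizes are simulated assumptions. The idea is the same: predict on every combination of factor levels with a zero offset, then exponentiate to get a multiplicative factor per cell.

```r
# Simulated toy data with two factors and an offset (expected deaths)
set.seed(7)
toy <- expand.grid(
  gender = c("F", "M"),
  plan   = c("Perm", "Term", "xL"),
  rep    = 1:50,
  stringsAsFactors = TRUE
)
toy$expected <- runif(nrow(toy), 0.5, 2)
toy$deaths   <- rpois(nrow(toy), toy$expected * ifelse(toy$gender == "M", 1.1, 1.0))

# Unpenalized Poisson GLM with a log-offset, standing in for the glmnet fit
fit <- glm(deaths ~ gender * plan + offset(log(expected)),
           family = poisson, data = toy)

# One row per combination of factor levels
grid <- expand.grid(lapply(toy[c("gender", "plan")], levels))

# expected = 1 gives log(1) = 0, i.e. a zero offset
grid$expected <- 1

# Link-scale prediction exponentiated to a multiplicative factor
grid$Factor <- exp(predict(fit, newdata = grid, type = "link"))
grid
```

With a zero offset the factor for each cell is the predicted actual-to-expected ratio, which is what the interaction plots and tables summarize.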

Below are plots of the 2-way interaction terms, with external factors fixed at their middle values.

## Generate and print plots for each interaction pair in glmnet.int.list
glmnet.int.list %>%
  map(.f = \(x) {
    ## Generate a plot of cross-validated net coefficients
    plotCVNetCoefs(
      train.grid,
      sort(x),
      "Factor",
      pred_cols
    )
  }) %>%
  ## Set names for each plot based on the interaction pairs
  purrr::set_names(
    map(glmnet.int.list, \(x) paste0(x, collapse = " x "))
  ) %>%
  ## Print each plot with a corresponding title
  iwalk(~ {
    cat('#### ', .y, '\n\n')  # Print the plot title
    print(.x)  # Print the plot
    cat('\n\n')  # Add spacing after each plot
  })

[Plots not reproduced here. One panel per interaction pair, in this order: uw x ia_band1, uw x face_amount_band, face_amount_band x ia_band1, face_amount_band x dur_band1, uw x dur_band1, dur_band1 x ia_band1, face_amount_band x gender, dur_band1 x ltp, face_amount_band x ltp, uw x ltp, face_amount_band x iy_band1, ia_band1 x ltp, uw x insurance_plan, ia_band1 x gender, ia_band1 x iy_band1, uw x gender, face_amount_band x insurance_plan, ia_band1 x insurance_plan, uw x iy_band1, dur_band1 x gender, insurance_plan x ltp, dur_band1 x iy_band1, dur_band1 x insurance_plan, gender x insurance_plan, ltp x iy_band1, insurance_plan x iy_band1, gender x iy_band1, gender x ltp.]

Tables of Terms

Below are tables of 2-way interactions, with external factors fixed at their middle values.

## Generate and format summary tables for each interaction pair in glmnet.int.list
glmnet.int.list %>%
  map(.f = \(x) {
    # Generate a table of cross-validated net coefficients
    tableCVNetCoefs(
      train.grid,
      sort(x),
      "Factor",
      pred.cols = pred_cols
    ) %>%
      ## Create a flextable from the generated table
      flextable() %>%
      ## Format the table values, converting numeric values to percentages
      set_formatter(values = function(x) {
        if (is.numeric(x))
          sprintf("%.1f%%", x * 100)
        else
          x
      }) %>%
      ## Set a caption for the table
      set_caption(
        paste0("Factors for ", x[1], " and ", x[2], ", all other factors fixed at middle levels")
      ) %>%
      ## Set table properties to enable scrolling
      set_table_properties(opts_html = list(
        scroll = list(
          add_css = "max-height: 500px;"
        )
      )) %>%
      autofit() #%>%
    ## Print the flextable
    #knitr::knit_print()
  }) %>%
  ## Set names for each table based on the interaction pairs
  purrr::set_names(
    map(glmnet.int.list, \(x) paste0(x, collapse = " x "))
  ) -> 
  output_tables

if(output_format == "html") {
  output_tables %>%
    map(.f=knitr::knit_print) %>%
    ## Generate a tabset from the list of tables
    generate_tabset(
      tabtitle = "",
      tablevel = 3
    ) %>%
    ## Print the generated tabset
    cat()
} else {
  export_tables_to_excel(
    output_tables,
    paste0(
      exportsRoot,
      "_glmnet_int_twoway_interaction_tables.xlsx"
    )
  )
  
  cat("See included Excel table for additional information.\n")
}

uw x ia_band1

Factors for uw and ia_band1, all other factors fixed at middle levels

| ia_band1 | N/1/1 | N/2/1 | N/2/2 | N/3/1 | N/3/2 | N/3/3 | N/4/1 | N/4/2 | N/4/3 | N/4/4 | S/1/1 | S/2/1 | S/2/2 | U/1/1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 18-24 | 97.1% | 97.1% | 148.3% | 95.6% | 108.3% | 123.0% | 75.3% | 68.4% | 97.1% | 108.1% | 104.5% | 108.6% | 120.9% | 112.6% |
| 25-34 | 95.2% | 95.2% | 158.7% | 94.6% | 107.0% | 147.5% | 73.9% | 67.0% | 108.6% | 115.3% | 99.6% | 101.5% | 118.5% | 110.4% |
| 35-44 | 91.9% | 90.2% | 147.3% | 90.5% | 104.4% | 153.1% | 71.9% | 66.9% | 91.9% | 114.8% | 98.8% | 97.8% | 114.3% | 114.5% |
| 45-54 | 86.7% | 85.2% | 144.9% | 85.3% | 94.4% | 130.6% | 67.2% | 60.9% | 94.9% | 110.8% | 93.2% | 96.9% | 113.6% | 102.3% |
| 55-64 | 84.8% | 77.7% | 129.5% | 75.7% | 81.8% | 113.7% | 54.3% | 56.1% | 84.8% | 94.4% | 99.8% | 93.1% | 97.2% | 99.9% |
| 65-74 | 105.5% | 105.5% | 161.2% | 92.0% | 101.1% | 129.9% | 70.0% | 60.1% | 98.2% | 117.5% | 113.6% | 118.1% | 135.3% | 120.2% |
| 75-84 | 100.9% | 114.4% | 150.9% | 92.2% | 93.1% | 106.2% | 40.0% | 97.5% | 90.3% | 101.8% | 129.2% | 112.8% | 121.5% | 117.0% |
| 85-99 | 121.8% | 232.6% | 187.8% | 119.8% | 153.4% | 128.7% | 94.5% | 76.3% | 101.5% | 119.4% | 131.0% | 193.2% | 256.9% | 141.2% |

uw x face_amount_band

Factors for uw and face_amount_band, all other factors fixed at middle levels

| face_amount_band | N/1/1 | N/2/1 | N/2/2 | N/3/1 | N/3/2 | N/3/3 | N/4/1 | N/4/2 | N/4/3 | N/4/4 | S/1/1 | S/2/1 | S/2/2 | U/1/1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 01 - 0 - 49,999 | 120.8% | 118.8% | 200.1% | 118.8% | 131.6% | 182.0% | 98.4% | 88.3% | 132.2% | 154.4% | 129.9% | 135.8% | 158.3% | 142.6% |
| 04 - 50,000 - 99,999 | 115.3% | 113.4% | 192.5% | 113.5% | 125.6% | 173.2% | 93.9% | 84.3% | 126.2% | 147.4% | 124.1% | 129.7% | 151.1% | 136.1% |
| 05 - 100,000 - 249,999 | 86.7% | 85.2% | 144.9% | 85.3% | 94.4% | 130.6% | 67.2% | 60.9% | 94.9% | 110.8% | 93.2% | 96.9% | 113.6% | 102.3% |
| 06 - 250,000 - 499,999 | 80.8% | 79.4% | 133.8% | 79.5% | 88.0% | 121.7% | 65.1% | 59.0% | 84.0% | 103.3% | 86.9% | 90.9% | 105.9% | 95.4% |
| 07 - 500,000 - 999,999 | 78.8% | 77.0% | 129.5% | 77.6% | 85.7% | 122.3% | 63.2% | 57.6% | 90.7% | 107.4% | 81.1% | 89.5% | 119.9% | 93.1% |
| 08 - 1,000,000+ | 94.7% | 85.3% | 163.9% | 92.4% | 107.2% | 163.7% | 79.8% | 67.8% | 107.7% | 145.1% | 101.9% | 104.1% | 124.2% | 123.9% |

face_amount_band x ia_band1

Factors for face_amount_band and ia_band1, all other factors fixed at middle levels

| face_amount_band | 18-24 | 25-34 | 35-44 | 45-54 | 55-64 | 65-74 | 75-84 | 85-99 |
|---|---|---|---|---|---|---|---|---|
| 01 - 0 - 49,999 | 102.7% | 109.2% | 100.4% | 98.4% | 74.0% | 89.7% | 52.6% | 128.7% |
| 04 - 50,000 - 99,999 | 98.0% | 106.8% | 103.0% | 93.9% | 69.3% | 85.7% | 49.0% | 122.9% |
| 05 - 100,000 - 249,999 | 75.3% | 73.9% | 71.9% | 67.2% | 54.3% | 70.0% | 40.0% | 94.5% |
| 06 - 250,000 - 499,999 | 76.7% | 71.3% | 70.0% | 65.1% | 55.3% | 73.3% | 42.5% | 96.2% |
| 07 - 500,000 - 999,999 | 74.1% | 71.7% | 66.7% | 63.2% | 53.4% | 72.9% | 42.5% | 95.5% |
| 08 - 1,000,000+ | 92.4% | 85.7% | 84.8% | 79.8% | 66.6% | 82.7% | 47.3% | 99.0% |

face_amount_band x dur_band1

Factors for face_amount_band and dur_band1, all other factors fixed at middle levels

| dur_band1 | 01 - 0 - 49,999 | 04 - 50,000 - 99,999 | 05 - 100,000 - 249,999 | 06 - 250,000 - 499,999 | 07 - 500,000 - 999,999 | 08 - 1,000,000+ |
|---|---|---|---|---|---|---|
| 01 | 97.9% | 93.4% | 66.9% | 64.7% | 62.9% | 80.1% |
| 02 | 101.0% | 96.5% | 69.1% | 66.8% | 64.9% | 80.9% |
| 03 | 98.4% | 93.9% | 67.2% | 65.1% | 63.2% | 79.8% |
| 04-05 | 97.9% | 93.4% | 65.8% | 63.7% | 62.9% | 80.1% |
| 06-15 | 88.9% | 84.9% | 58.9% | 58.8% | 55.6% | 68.0% |
| 16-25 | 72.7% | 66.9% | 48.6% | 48.1% | 47.3% | 63.4% |

uw x dur_band1

Factors for uw and dur_band1, all other factors fixed at middle levels

| dur_band1 | N/1/1 | N/2/1 | N/2/2 | N/3/1 | N/3/2 | N/3/3 | N/4/1 | N/4/2 | N/4/3 | N/4/4 | S/1/1 | S/2/1 | S/2/2 | U/1/1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 01 | 85.3% | 83.8% | 121.6% | 78.4% | 92.9% | 128.4% | 66.9% | 71.0% | 93.3% | 106.2% | 91.7% | 95.4% | 111.7% | 100.6% |
| 02 | 108.1% | 152.6% | 154.2% | 99.4% | 120.3% | 162.9% | 69.1% | 90.0% | 134.2% | 134.7% | 116.3% | 120.9% | 141.7% | 127.6% |
| 03 | 86.7% | 85.2% | 144.9% | 85.3% | 94.4% | 130.6% | 67.2% | 60.9% | 94.9% | 110.8% | 93.2% | 96.9% | 113.6% | 102.3% |
| 04-05 | 83.9% | 66.8% | 119.6% | 74.7% | 91.4% | 135.0% | 65.8% | 69.8% | 90.8% | 104.5% | 90.2% | 91.7% | 109.9% | 99.0% |
| 06-15 | 75.0% | 71.2% | 107.6% | 72.7% | 84.4% | 113.0% | 58.9% | 62.5% | 88.2% | 94.8% | 80.7% | 74.5% | 98.3% | 90.8% |
| 16-25 | 61.9% | 61.2% | 89.8% | 55.3% | 61.3% | 82.2% | 48.6% | 50.1% | 67.8% | 62.9% | 72.1% | 69.3% | 84.3% | 74.1% |

dur_band1 x ia_band1

Factors for dur_band1 and ia_band1, all other factors fixed at middle levels

| dur_band1 | 18-24 | 25-34 | 35-44 | 45-54 | 55-64 | 65-74 | 75-84 | 85-99 |
|---|---|---|---|---|---|---|---|---|
| 01 | 76.2% | 74.7% | 71.3% | 66.9% | 59.1% | 70.8% | 40.4% | 95.6% |
| 02 | 80.5% | 78.9% | 75.6% | 69.1% | 68.4% | 74.8% | 37.4% | 101.0% |
| 03 | 75.3% | 73.9% | 71.9% | 67.2% | 54.3% | 70.0% | 40.0% | 94.5% |
| 04-05 | 75.0% | 73.6% | 64.5% | 65.8% | 56.5% | 70.8% | 45.0% | 77.9% |
| 06-15 | 67.1% | 65.7% | 58.8% | 58.9% | 52.0% | 60.2% | 35.6% | 83.3% |
| 16-25 | 53.8% | 52.3% | 49.0% | 48.6% | 46.8% | 51.6% | 28.4% | 37.9% |

face_amount_band x gender

Factors for face_amount_band and gender, all other factors fixed at middle levels

| face_amount_band | F | M |
|---|---|---|
| 01 - 0 - 49,999 | 98.4% | 108.7% |
| 04 - 50,000 - 99,999 | 93.9% | 107.2% |
| 05 - 100,000 - 249,999 | 67.2% | 74.3% |
| 06 - 250,000 - 499,999 | 65.1% | 67.1% |
| 07 - 500,000 - 999,999 | 63.2% | 63.4% |
| 08 - 1,000,000+ | 79.8% | 79.0% |

dur_band1 x ltp

Factors for dur_band1 and ltp, all other factors fixed at middle levels

| dur_band1 | 5 yr | 10 yr | 15 yr | 20 yr | 25 yr | 30 yr | Not Level Term | Unknown |
|---|---|---|---|---|---|---|---|---|
| 01 | 74.3% | 79.6% | 62.8% | 66.9% | 62.0% | 65.7% | 70.8% | 77.6% |
| 02 | 76.7% | 82.2% | 64.8% | 69.1% | 64.0% | 67.8% | 73.1% | 80.2% |
| 03 | 74.6% | 80.0% | 63.5% | 67.2% | 62.3% | 66.0% | 71.1% | 78.0% |
| 04-05 | 73.1% | 77.2% | 59.1% | 65.8% | 61.2% | 72.5% | 68.6% | 76.4% |
| 06-15 | 71.9% | 80.1% | 58.3% | 58.9% | 56.5% | 63.6% | 68.5% | 75.2% |
| 16-25 | 65.2% | 86.1% | 122.9% | 48.6% | 52.8% | 53.5% | 62.2% | 68.2% |

face_amount_band x ltp

Factors for face_amount_band and ltp, all other factors fixed at middle levels

| face_amount_band | 5 yr | 10 yr | 15 yr | 20 yr | 25 yr | 30 yr | Not Level Term | Unknown |
|---|---|---|---|---|---|---|---|---|
| 01 - 0 - 49,999 | 104.4% | 113.0% | 96.6% | 98.4% | 87.2% | 90.6% | 100.9% | 109.1% |
| 04 - 50,000 - 99,999 | 99.7% | 107.9% | 89.7% | 93.9% | 83.2% | 86.5% | 86.9% | 105.2% |
| 05 - 100,000 - 249,999 | 74.6% | 80.0% | 63.5% | 67.2% | 62.3% | 66.0% | 71.1% | 78.0% |
| 06 - 250,000 - 499,999 | 70.0% | 75.7% | 64.7% | 65.1% | 58.4% | 60.5% | 67.6% | 70.0% |
| 07 - 500,000 - 999,999 | 68.8% | 76.7% | 63.6% | 63.2% | 55.5% | 57.6% | 65.4% | 65.4% |
| 08 - 1,000,000+ | 86.8% | 92.1% | 82.1% | 79.8% | 72.5% | 72.8% | 83.9% | 90.8% |

uw x ltp

Factors for uw and ltp, all other factors fixed at middle levels

| ltp | N/1/1 | N/2/1 | N/2/2 | N/3/1 | N/3/2 | N/3/3 | N/4/1 | N/4/2 | N/4/3 | N/4/4 | S/1/1 | S/2/1 | S/2/2 | U/1/1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 yr | 96.2% | 94.6% | 150.3% | 103.0% | 107.4% | 136.8% | 74.6% | 67.1% | 105.3% | 113.4% | 91.8% | 107.6% | 119.5% | 113.6% |
| 10 yr | 95.3% | 93.7% | 162.5% | 102.0% | 106.4% | 140.2% | 80.0% | 66.5% | 87.0% | 112.3% | 98.2% | 106.6% | 118.9% | 112.5% |
| 15 yr | 83.2% | 80.3% | 130.0% | 85.2% | 83.3% | 118.4% | 63.5% | 58.1% | 91.1% | 102.5% | 79.4% | 93.1% | 103.4% | 98.3% |
| 20 yr | 86.7% | 85.2% | 144.9% | 85.3% | 94.4% | 130.6% | 67.2% | 60.9% | 94.9% | 110.8% | 93.2% | 96.9% | 113.6% | 102.3% |
| 25 yr | 80.3% | 79.0% | 128.2% | 79.0% | 71.6% | 114.2% | 62.3% | 56.0% | 87.9% | 94.7% | 76.6% | 89.9% | 99.8% | 94.8% |
| 30 yr | 89.1% | 87.7% | 138.9% | 87.2% | 83.5% | 121.2% | 66.0% | 62.2% | 102.4% | 108.2% | 85.1% | 99.7% | 110.7% | 105.2% |
| Not Level Term | 91.8% | 78.7% | 143.4% | 98.2% | 102.5% | 130.5% | 71.1% | 57.5% | 81.7% | 108.2% | 87.6% | 92.7% | 114.0% | 121.8% |
| Unknown | 100.6% | 102.7% | 183.0% | 107.7% | 127.9% | 158.9% | 78.0% | 70.1% | 110.1% | 102.9% | 98.5% | 112.5% | 124.9% | 118.7% |

face_amount_band x iy_band1

Factors for face_amount_band and iy_band1, all other factors fixed at middle levels

| face_amount_band | 1900-1989 | 1990-1999 | 2000-2009 | 2010+ |
|---|---|---|---|---|
| 01 - 0 - 49,999 | 105.6% | 98.4% | 104.3% | 105.6% |
| 04 - 50,000 - 99,999 | 100.8% | 93.9% | 99.6% | 111.0% |
| 05 - 100,000 - 249,999 | 72.2% | 67.2% | 70.3% | 72.2% |
| 06 - 250,000 - 499,999 | 69.8% | 65.1% | 64.7% | 63.0% |
| 07 - 500,000 - 999,999 | 67.8% | 63.2% | 59.7% | 54.7% |
| 08 - 1,000,000+ | 91.5% | 79.8% | 75.6% | 65.5% |

ia_band1 x ltp

Factors for ia_band1 and ltp, all other factors fixed at middle levels

| ia_band1 | 5 yr | 10 yr | 15 yr | 20 yr | 25 yr | 30 yr | Not Level Term | Unknown |
|---|---|---|---|---|---|---|---|---|
| 18-24 | 79.7% | 85.4% | 72.5% | 75.3% | 66.5% | 70.5% | 75.9% | 83.3% |
| 25-34 | 78.1% | 83.9% | 70.7% | 73.9% | 64.2% | 66.9% | 74.5% | 73.3% |
| 35-44 | 76.7% | 83.0% | 72.6% | 71.9% | 62.4% | 67.8% | 76.1% | 75.4% |
| 45-54 | 74.6% | 80.0% | 63.5% | 67.2% | 62.3% | 66.0% | 71.1% | 78.0% |
| 55-64 | 57.5% | 57.3% | 52.3% | 54.3% | 48.0% | 50.8% | 52.6% | 60.1% |
| 65-74 | 74.1% | 79.4% | 78.9% | 70.0% | 61.9% | 65.6% | 70.6% | 77.4% |
| 75-84 | 42.3% | 45.3% | 38.5% | 40.0% | 35.3% | 37.4% | 40.3% | 70.0% |
| 85-99 | 100.0% | 107.1% | 90.9% | 94.5% | 83.5% | 88.4% | 95.3% | 104.5% |

uw x insurance_plan

Factors for uw and insurance_plan, all other factors fixed at middle levels

| insurance_plan | N/1/1 | N/2/1 | N/2/2 | N/3/1 | N/3/2 | N/3/3 | N/4/1 | N/4/2 | N/4/3 | N/4/4 | S/1/1 | S/2/1 | S/2/2 | U/1/1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Other | 94.0% | 92.4% | 144.6% | 92.5% | 100.3% | 118.4% | 72.9% | 66.1% | 102.9% | 120.1% | 105.9% | 105.1% | 123.1% | 95.6% |
| Perm | 86.7% | 85.2% | 144.9% | 85.3% | 94.4% | 130.6% | 67.2% | 60.9% | 94.9% | 110.8% | 93.2% | 96.9% | 113.6% | 102.3% |
| Term | 94.0% | 92.4% | 173.9% | 78.7% | 100.3% | 118.4% | 72.9% | 66.1% | 102.9% | 132.8% | 114.6% | 105.1% | 124.9% | 95.6% |
| xL | 100.8% | 98.2% | 155.2% | 72.4% | 85.2% | 128.1% | 66.6% | 64.5% | 110.4% | 130.3% | 113.7% | 107.4% | 125.8% | 102.7% |

ia_band1 x gender

Factors for ia_band1 and gender, all other factors fixed at middle levels

| gender | 18-24 | 25-34 | 35-44 | 45-54 | 55-64 | 65-74 | 75-84 | 85-99 |
|---|---|---|---|---|---|---|---|---|
| F | 75.3% | 73.9% | 71.9% | 67.2% | 54.3% | 70.0% | 40.0% | 94.5% |
| M | 80.0% | 84.5% | 81.2% | 74.3% | 57.7% | 73.7% | 40.9% | 97.4% |

ia_band1 x iy_band1

Factors for ia_band1 and iy_band1, all other factors fixed at middle levels

| ia_band1 | 1900-1989 | 1990-1999 | 2000-2009 | 2010+ |
|---|---|---|---|---|
| 18-24 | 80.9% | 75.3% | 78.7% | 80.9% |
| 25-34 | 79.3% | 73.9% | 74.6% | 83.2% |
| 35-44 | 80.3% | 71.9% | 78.2% | 86.1% |
| 45-54 | 72.2% | 67.2% | 70.3% | 72.2% |
| 55-64 | 58.3% | 54.3% | 55.0% | 58.3% |
| 65-74 | 73.6% | 70.0% | 71.7% | 64.3% |
| 75-84 | 43.1% | 40.0% | 38.9% | 43.1% |
| 85-99 | 101.4% | 94.5% | 95.9% | 101.4% |

uw x gender

Factors for uw and gender, all other factors fixed at middle levels

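| gender | N/1/1 | N/2/1 | N/2/2 | N/3/1 | N/3/2 | N/3/3 | N/4/1 | N/4/2 | N/4/3 | N/4/4 | S/1/1 | S/2/1 | S/2/2 | U/1/1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| F | 86.7% | 85.2% | 144.9% | 85.3% | 94.4% | 130.6% | 67.2% | 60.9% | 94.9% | 110.8% | 93.2% | 96.9% | 113.6% | 102.3% |
| M | 95.7% | 98.0% | 160.6% | 94.2% | 107.6% | 141.8% | 74.3% | 75.3% | 114.6% | 123.8% | 100.9% | 107.1% | 123.6% | 113.0% |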

face_amount_band x insurance_plan

Factors for face_amount_band and insurance_plan, all other factors fixed at middle levels

| face_amount_band | Other | Perm | Term | xL |
|---|---|---|---|---|
| 01 - 0 - 49,999 | 99.4% | 98.4% | 99.4% | 90.9% |
| 04 - 50,000 - 99,999 | 99.4% | 93.9% | 101.5% | 90.9% |
| 05 - 100,000 - 249,999 | 72.9% | 67.2% | 72.9% | 66.6% |
| 06 - 250,000 - 499,999 | 74.6% | 65.1% | 71.7% | 68.2% |
| 07 - 500,000 - 999,999 | 79.4% | 63.2% | 72.8% | 72.5% |
| 08 - 1,000,000+ | 80.6% | 79.8% | 76.2% | 73.7% |

ia_band1 x insurance_plan

Factors for ia_band1 and insurance_plan, all other factors fixed at middle levels

| ia_band1 | Other | Perm | Term | xL |
|---|---|---|---|---|
| 18-24 | 80.8% | 75.3% | 80.8% | 72.8% |
| 25-34 | 74.5% | 73.9% | 74.1% | 71.3% |
| 35-44 | 77.1% | 71.9% | 77.1% | 70.9% |
| 45-54 | 72.9% | 67.2% | 72.9% | 66.6% |
| 55-64 | 62.0% | 54.3% | 62.0% | 55.9% |
| 65-74 | 75.1% | 70.0% | 75.1% | 63.9% |
| 75-84 | 42.6% | 40.0% | 42.6% | 37.6% |
| 85-99 | 80.8% | 94.5% | 80.8% | 65.8% |

uw x iy_band1

Factors for uw and iy_band1, all other factors fixed at middle levels

| iy_band1 | N/1/1 | N/2/1 | N/2/2 | N/3/1 | N/3/2 | N/3/3 | N/4/1 | N/4/2 | N/4/3 | N/4/4 | S/1/1 | S/2/1 | S/2/2 | U/1/1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1900-1989 | 87.7% | 86.2% | 136.9% | 86.3% | 85.3% | 132.1% | 72.2% | 65.2% | 96.0% | 112.1% | 94.3% | 87.2% | 97.4% | 103.5% |
| 1990-1999 | 86.7% | 85.2% | 144.9% | 85.3% | 94.4% | 130.6% | 67.2% | 60.9% | 94.9% | 110.8% | 93.2% | 96.9% | 113.6% | 102.3% |
| 2000-2009 | 86.4% | 77.9% | 139.3% | 83.2% | 84.0% | 136.9% | 70.3% | 64.3% | 94.6% | 110.4% | 94.4% | 85.9% | 105.5% | 102.0% |
| 2010+ | 87.7% | 75.8% | 136.9% | 86.3% | 85.3% | 133.2% | 72.2% | 65.2% | 96.0% | 112.1% | 94.3% | 83.5% | 97.4% | 103.5% |

dur_band1 x gender

Factors for dur_band1 and gender, all other factors fixed at middle levels

| dur_band1 | F | M |
|---|---|---|
| 01 | 66.9% | 73.9% |
| 02 | 69.1% | 78.3% |
| 03 | 67.2% | 74.3% |
| 04-05 | 65.8% | 75.4% |
| 06-15 | 58.9% | 65.0% |
| 16-25 | 48.6% | 53.7% |

insurance_plan x ltp

Factors for insurance_plan and ltp, all other factors fixed at middle levels

| insurance_plan | 5 yr | 10 yr | 15 yr | 20 yr | 25 yr | 30 yr | Not Level Term | Unknown |
|---|---|---|---|---|---|---|---|---|
| Other | 80.9% | 86.7% | 68.8% | 72.9% | 67.6% | 71.6% | 77.1% | 84.6% |
| Perm | 74.6% | 80.0% | 63.5% | 67.2% | 62.3% | 66.0% | 71.1% | 78.0% |
| Term | 80.9% | 86.7% | 68.8% | 72.9% | 67.6% | 71.6% | 58.5% | 84.6% |
| xL | 74.0% | 79.3% | 62.9% | 66.6% | 61.8% | 65.4% | 70.8% | 77.3% |

dur_band1 x iy_band1

Factors for dur_band1 and iy_band1, all other factors fixed at middle levels

| dur_band1 | 1900-1989 | 1990-1999 | 2000-2009 | 2010+ |
|---|---|---|---|---|
| 01 | 71.8% | 66.9% | 69.9% | 71.8% |
| 02 | 74.1% | 69.1% | 72.2% | 74.1% |
| 03 | 72.2% | 67.2% | 70.3% | 72.2% |
| 04-05 | 70.6% | 65.8% | 69.8% | 69.1% |
| 06-15 | 63.2% | 58.9% | 61.4% | 63.2% |
| 16-25 | 52.2% | 48.6% | 48.5% | 52.2% |

dur_band1 x insurance_plan

Factors for dur_band1 and insurance_plan, all other factors fixed at middle levels

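| dur_band1 | Other | Perm | Term | xL |
|---|---|---|---|---|
| 01 | 72.5% | 66.9% | 72.5% | 60.7% |
| 02 | 57.7% | 69.1% | 57.7% | 46.6% |
| 03 | 72.9% | 67.2% | 72.9% | 66.6% |
| 04-05 | 71.3% | 65.8% | 71.3% | 58.4% |
| 06-15 | 63.8% | 58.9% | 63.8% | 55.6% |
| 16-25 | 60.4% | 48.6% | 60.4% | 50.5% |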

gender x insurance_plan

Factors for gender and insurance_plan, all other factors fixed at middle levels

| gender | Other | Perm | Term | xL |
|---|---|---|---|---|
| F | 72.9% | 67.2% | 72.9% | 66.6% |
| M | 78.8% | 74.3% | 78.8% | 72.0% |

ltp x iy_band1

Factors for ltp and iy_band1, all other factors fixed at middle levels

| iy_band1 | 5 yr | 10 yr | 15 yr | 20 yr | 25 yr | 30 yr | Not Level Term | Unknown |
|---|---|---|---|---|---|---|---|---|
| 1900-1989 | 79.2% | 84.9% | 67.4% | 72.2% | 79.2% | 77.1% | 78.0% | 79.2% |
| 1990-1999 | 74.6% | 80.0% | 63.5% | 67.2% | 62.3% | 66.0% | 71.1% | 78.0% |
| 2000-2009 | 77.1% | 82.7% | 63.2% | 70.3% | 77.1% | 75.1% | 76.0% | 73.7% |
| 2010+ | 79.2% | 84.9% | 67.4% | 72.2% | 79.2% | 77.1% | 78.0% | 79.2% |

insurance_plan x iy_band1

Factors for insurance_plan and iy_band1, all other factors fixed at middle levels

| insurance_plan | 1900-1989 | 1990-1999 | 2000-2009 | 2010+ |
|---|---|---|---|---|
| Other | 78.2% | 72.9% | 76.2% | 78.2% |
| Perm | 72.2% | 67.2% | 70.3% | 72.2% |
| Term | 78.2% | 72.9% | 76.2% | 78.2% |
| xL | 73.9% | 66.6% | 71.2% | 76.6% |

gender x iy_band1

Factors for gender and iy_band1, all other factors fixed at middle levels

| gender | 1900-1989 | 1990-1999 | 2000-2009 | 2010+ |
|---|---|---|---|---|
| F | 72.2% | 67.2% | 70.3% | 72.2% |
| M | 82.8% | 74.3% | 80.6% | 85.7% |

gender x ltp

Factors for gender and ltp, all other factors fixed at middle levels

| gender | 5 yr | 10 yr | 15 yr | 20 yr | 25 yr | 30 yr | Not Level Term | Unknown |
|---|---|---|---|---|---|---|---|---|
| F | 74.6% | 80.0% | 63.5% | 67.2% | 62.3% | 66.0% | 71.1% | 78.0% |
| M | 81.6% | 87.5% | 70.7% | 74.3% | 68.2% | 72.2% | 77.8% | 85.3% |

Goodness-of-Fit

The goodness-of-fit tables report actual-to-model ratios, that is, the sum of actual claim amounts divided by the sum of model-predicted amounts, for single variables and for two-way combinations of variables. Qualitatively, a model is deemed to perform well if these ratios are close to 100% in almost all cells. Quantitative assessment using significance testing is omitted here.
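As a minimal sketch of how an actual-to-model ratio is formed, the following base R snippet computes the ratio by group on a small toy data set. The data values are invented for illustration; only the column names `amount_actual` and `predictions_glmnet` are taken from the code in this report, and base R `aggregate()` is used here in place of the report's dplyr pipeline to keep the example self-contained.

```r
# Toy data (hypothetical values); column names mirror those used in this report
toy <- data.frame(
  gender             = c("F", "F", "M", "M"),
  amount_actual      = c(100, 300, 250, 150),
  predictions_glmnet = c(120, 280, 180, 200)
)

# Sum actual and predicted amounts within each level of the grouping variable
sums <- aggregate(
  cbind(amount_actual, predictions_glmnet) ~ gender,
  data = toy,
  FUN = sum
)

# Actual-to-model ratio: total actual over total predicted, per group
sums$AM <- sums$amount_actual / sums$predictions_glmnet

# Format as a percentage with one decimal place, as in the tables
sums$AM_pct <- sprintf("%.1f%%", 100 * sums$AM)
print(sums)
```

Because the ratio is computed from group totals rather than averaged over rows, large policies dominate each cell, which is why cells with a small Outcome can show ratios far from 100% without indicating a systematic problem.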

Univariate Goodness-of-Fit

## Generate and format summary tables for each factor column
map(factor_cols, .f = \(x) {
  ## Convert column name to symbol for tidy evaluation
  x <- sym(x)
  resp_var_sym <- resp_var
  
  ## Summarize data by the current factor column
  train %>%
    group_by(!!x) %>%
    summarize(
      Outcome = sum(amount_actual),  # Sum the actual amounts
      AM = sum(amount_actual) / sum(predictions_glmnet)  # Calculate the Actual-to-Model ratio
    ) %>%
    ## Create a flextable from the summarized data
    flextable() %>%
    ## Format the Actual-to-Model column as percentages
    set_formatter(
      AM = function(x) {
        if (is.numeric(x))
          sprintf("%.1f%%", x * 100)
        else
          x
      }
    ) %>%
    ## Format the Outcome column as numbers
    colformat_num(j = "Outcome") %>%
    ## Set header labels for the table
    set_header_labels(
      Outcome = "Outcome",
      AM = "Actual-to-Model"
    ) %>%
    autofit() #%>%
    # Print the flextable
    #knitr::knit_print()
}) %>%
  ## Set names for each table based on the factor columns
  purrr::set_names(factor_cols) ->
  output_tables

if(output_format == "html") {
  output_tables %>%
    map(.f=knitr::knit_print) %>%
    ## Generate a tabset from the list of tables
    generate_tabset(
      tabtitle = "",
      tablevel = 4
    ) %>%
    ## Print the generated tabset
    cat()
} else {
  export_tables_to_excel(
    output_tables,
    paste0(
      exportsRoot,
      "_glmnet_int_univariate_goodness_of_fit_tables.xlsx"
    )
  )
  
  cat("See included Excel table for additional information.\n")
}
uw

| uw | Outcome | Actual-to-Model |
|---|---|---|
| N/1/1 | 19,203,624,009 | 100.0% |
| N/2/1 | 9,661,768,053 | 99.9% |
| N/2/2 | 11,546,531,146 | 100.2% |
| N/3/1 | 5,950,371,727 | 99.7% |
| N/3/2 | 8,810,968,555 | 99.9% |
| N/3/3 | 14,970,350,057 | 100.2% |
| N/4/1 | 6,842,959,121 | 99.6% |
| N/4/2 | 5,162,511,227 | 99.5% |
| N/4/3 | 3,565,224,892 | 99.9% |
| N/4/4 | 3,742,908,023 | 100.5% |
| S/1/1 | 4,365,025,076 | 100.5% |
| S/2/1 | 1,910,521,866 | 98.9% |
| S/2/2 | 1,704,933,729 | 100.8% |
| U/1/1 | 498,038,979 | 103.1% |

face_amount_band

| face_amount_band | Outcome | Actual-to-Model |
|---|---|---|
| 01 - 0 - 49,999 | 4,193,809,072 | 103.7% |
| 04 - 50,000 - 99,999 | 5,283,024,574 | 100.1% |
| 05 - 100,000 - 249,999 | 16,241,007,604 | 99.7% |
| 06 - 250,000 - 499,999 | 14,745,604,133 | 99.7% |
| 07 - 500,000 - 999,999 | 15,180,744,107 | 99.8% |
| 08 - 1,000,000+ | 42,291,546,970 | 99.9% |

dur_band1

| dur_band1 | Outcome | Actual-to-Model |
|---|---|---|
| 01 | 1,269,136,295 | 101.7% |
| 02 | 1,852,767,081 | 99.2% |
| 03 | 2,494,157,892 | 100.8% |
| 04-05 | 6,321,445,332 | 99.8% |
| 06-15 | 55,677,490,288 | 100.0% |
| 16-25 | 30,320,739,572 | 100.0% |

ia_band1

| ia_band1 | Outcome | Actual-to-Model |
|---|---|---|
| 18-24 | 667,931,081 | 102.8% |
| 25-34 | 6,267,527,815 | 99.9% |
| 35-44 | 16,990,453,137 | 100.1% |
| 45-54 | 20,629,510,315 | 100.0% |
| 55-64 | 19,447,152,090 | 99.9% |
| 65-74 | 16,608,837,516 | 100.0% |
| 75-84 | 15,884,011,950 | 100.0% |
| 85-99 | 1,440,312,556 | 99.7% |

gender

| gender | Outcome | Actual-to-Model |
|---|---|---|
| F | 33,257,191,935 | 99.8% |
| M | 64,678,544,525 | 100.1% |

insurance_plan

| insurance_plan | Outcome | Actual-to-Model |
|---|---|---|
| Other | 187,352,300 | 100.5% |
| Perm | 12,161,923,261 | 100.0% |
| Term | 39,023,616,053 | 99.9% |
| xL | 46,562,844,846 | 100.1% |

ltp

| ltp | Outcome | Actual-to-Model |
|---|---|---|
| 5 yr | 686,343,514 | 108.0% |
| 10 yr | 7,271,664,723 | 100.1% |
| 15 yr | 6,017,756,560 | 99.8% |
| 20 yr | 17,767,048,607 | 99.9% |
| 25 yr | 525,222,922 | 98.4% |
| 30 yr | 3,092,731,785 | 99.5% |
| Not Level Term | 59,614,315,535 | 100.0% |
| Unknown | 2,960,652,814 | 99.4% |

iy_band1

| iy_band1 | Outcome | Actual-to-Model |
|---|---|---|
| 1900-1989 | 1,464,179,514 | 103.2% |
| 1990-1999 | 29,544,147,346 | 100.0% |
| 2000-2009 | 55,045,293,308 | 99.9% |
| 2010+ | 11,882,116,292 | 99.9% |

Bivariate Goodness-of-Fit

## Create a list of unique pairs of factor columns
## combn() enumerates each unordered pair once, in the same order as a nested i < j loop
pairs <- combn(factor_cols, 2)
pairlist <- data.table(F1 = pairs[1, ], F2 = pairs[2, ])

## Generate and format summary tables for each pair of factor columns
map2(.x = pairlist$F1, .y = pairlist$F2, .f = \(x, y) {
  xs <- sym(x)
  ys <- sym(y)
  
  ## Choose grouping order based on the number of levels in each factor
  if (nlevels(train[[x]]) >= nlevels(train[[y]])) {
    fttmp <- train %>%
      group_by(!!xs, !!ys) %>%
      summarize(
        Outcome = sum(amount_actual),
        Ratio = sprintf("%.1f%%", 100 * sum(amount_actual) / sum(predictions_glmnet))
      ) %>%
      pivot_wider(
        names_from = !!ys,
        values_from = c(Outcome, Ratio),
        names_glue = paste0(y, ": {", y, "}.{.value}"),
        names_vary = "slowest"
      )
  } else {
    fttmp <- train %>%
      group_by(!!ys, !!xs) %>%
      summarize(
        Outcome = sum(amount_actual),
        Ratio = sprintf("%.1f%%", 100 * sum(amount_actual) / sum(predictions_glmnet))
      ) %>%
      pivot_wider(
        names_from = !!xs,
        values_from = c(Outcome, Ratio),
        names_glue = paste0(x, ": {", x, "}.{.value}"),
        names_vary = "slowest"
      )
  }
  
  ## Adjust column keys for the flextable
  fttmp.colkeys <- names(fttmp)[1]
  for (i in 1:((length(names(fttmp)) - 1) / 2)) {
    fttmp.colkeys <- c(fttmp.colkeys, paste0("blank", i), names(fttmp)[(2 * i):(2 * i + 1)])
  }
  
  ## Create and print the flextable
  fttmp %>%
    flextable(col_keys = fttmp.colkeys) %>%
    ftExtra::span_header(sep = "\\.") %>%
    align(align = 'center', part = "all") %>%
    empty_blanks() %>%
    autofit() #%>%
    #knitr::knit_print()
}) %>%
  ## Set names for each element in the list based on the factor column pairs
  purrr::set_names(pairlist[, paste0(F1, " x ", F2)]) ->
  output_tables

if(output_format == "html") {
  output_tables %>%
    map(.f=knitr::knit_print) %>%
    ## Generate a tabset from the list of formatted tables
    generate_tabset(
      tabtitle = "",
      tablevel = 4
    ) %>%
    ## Print the generated tabset
    cat()
} else {
  export_tables_to_excel(
    output_tables,
    paste0(
      exportsRoot,
      "_glmnet_int_bivariate_goodness_of_fit_tables.xlsx"
    )
  )
  
  cat("See included Excel table for additional information.\n")
}
uw x face_amount_band

| uw | face_amount_band: 01 - 0 - 49,999 Outcome | face_amount_band: 01 - 0 - 49,999 Ratio | face_amount_band: 04 - 50,000 - 99,999 Outcome | face_amount_band: 04 - 50,000 - 99,999 Ratio | face_amount_band: 05 - 100,000 - 249,999 Outcome | face_amount_band: 05 - 100,000 - 249,999 Ratio | face_amount_band: 06 - 250,000 - 499,999 Outcome | face_amount_band: 06 - 250,000 - 499,999 Ratio | face_amount_band: 07 - 500,000 - 999,999 Outcome | face_amount_band: 07 - 500,000 - 999,999 Ratio | face_amount_band: 08 - 1,000,000+ Outcome | face_amount_band: 08 - 1,000,000+ Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| N/1/1 | 2,331,609,989 | 103.4% | 2,665,573,911 | 101.2% | 4,377,227,666 | 100.0% | 2,484,222,240 | 99.5% | 2,280,087,941 | 98.7% | 5,064,902,262 | 98.6% |
| N/2/1 | 67,510,912 | 106.5% | 227,491,364 | 103.1% | 1,877,102,924 | 100.1% | 1,581,998,866 | 99.8% | 1,613,477,143 | 99.4% | 4,294,186,844 | 99.8% |
| N/2/2 | 478,620,100 | 101.7% | 716,716,216 | 101.7% | 2,460,749,546 | 100.5% | 1,807,687,840 | 99.7% | 1,536,284,872 | 99.3% | 4,546,472,572 | 100.2% |
| N/3/1 | 4,589,248 | 114.2% | 36,548,841 | 96.9% | 636,379,622 | 99.2% | 1,099,392,788 | 99.6% | 1,269,934,939 | 100.3% | 2,903,526,289 | 99.6% |
| N/3/2 | 15,879,147 | 116.3% | 84,477,290 | 110.9% | 824,647,078 | 98.8% | 1,194,151,445 | 99.2% | 1,294,166,133 | 99.1% | 5,397,647,462 | 100.2% |
| N/3/3 | 89,078,013 | 105.3% | 266,651,094 | 95.9% | 1,685,052,988 | 100.7% | 1,897,715,243 | 99.8% | 1,940,524,074 | 100.6% | 9,091,328,645 | 100.1% |
| N/4/1 | 2,890,726 | 94.8% | 23,168,994 | 88.2% | 533,862,585 | 97.9% | 1,152,655,538 | 99.0% | 1,611,064,623 | 99.3% | 3,519,316,655 | 100.3% |
| N/4/2 | 5,630,152 | 94.4% | 43,592,461 | 88.3% | 580,655,929 | 98.1% | 959,152,506 | 101.2% | 1,142,130,198 | 99.4% | 2,431,349,981 | 99.6% |
| N/4/3 | 2,758,598 | 107.8% | 19,493,024 | 84.7% | 410,982,168 | 97.6% | 652,807,204 | 98.4% | 822,151,439 | 101.3% | 1,657,032,459 | 100.6% |
| N/4/4 | 16,764,842 | 101.0% | 69,558,070 | 96.3% | 501,578,433 | 99.5% | 668,382,546 | 100.3% | 698,077,315 | 101.6% | 1,788,546,817 | 100.6% |
| S/1/1 | 846,343,572 | 104.5% | 864,666,577 | 99.7% | 1,161,486,983 | 99.0% | 510,128,489 | 100.9% | 361,498,469 | 97.3% | 620,900,986 | 100.9% |
| S/2/1 | 21,486,405 | 96.6% | 85,321,233 | 92.5% | 614,066,701 | 98.1% | 400,530,746 | 99.7% | 315,167,594 | 103.6% | 473,949,187 | 97.9% |
| S/2/2 | 74,338,988 | 106.8% | 135,934,467 | 92.6% | 492,786,516 | 99.5% | 294,838,420 | 102.0% | 267,040,410 | 104.2% | 439,994,928 | 101.2% |
| U/1/1 | 236,308,380 | 106.5% | 43,831,032 | 90.9% | 84,428,465 | 105.5% | 41,940,262 | 95.9% | 29,138,957 | 88.2% | 62,391,883 | 111.2% |

uw x dur_band1

| uw | dur_band1: 01 Outcome | dur_band1: 01 Ratio | dur_band1: 02 Outcome | dur_band1: 02 Ratio | dur_band1: 03 Outcome | dur_band1: 03 Ratio | dur_band1: 04-05 Outcome | dur_band1: 04-05 Ratio | dur_band1: 06-15 Outcome | dur_band1: 06-15 Ratio | dur_band1: 16-25 Outcome | dur_band1: 16-25 Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| N/1/1 | 107,404,206 | 98.1% | 127,611,866 | 99.5% | 186,488,501 | 107.0% | 495,415,170 | 112.5% | 4,962,783,774 | 98.8% | 13,323,920,492 | 99.9% |
| N/2/1 | 38,939,578 | 73.6% | 117,924,754 | 107.3% | 103,706,358 | 106.5% | 187,685,672 | 94.8% | 4,423,311,154 | 99.7% | 4,790,200,537 | 100.3% |
| N/2/2 | 68,820,764 | 97.5% | 89,429,859 | 92.9% | 153,874,204 | 106.5% | 321,513,019 | 99.2% | 6,248,105,591 | 100.2% | 4,664,787,709 | 100.3% |
| N/3/1 | 91,917,216 | 86.7% | 162,393,895 | 97.8% | 239,046,418 | 104.2% | 463,263,819 | 97.5% | 4,110,289,163 | 100.3% | 883,461,216 | 98.7% |
| N/3/2 | 73,068,462 | 89.0% | 142,447,178 | 107.1% | 172,423,532 | 98.0% | 471,435,974 | 98.0% | 7,112,246,310 | 100.2% | 839,347,099 | 98.6% |
| N/3/3 | 209,144,931 | 114.5% | 274,764,174 | 97.7% | 374,385,325 | 101.0% | 1,050,466,173 | 101.2% | 12,379,149,521 | 100.0% | 682,439,933 | 98.2% |
| N/4/1 | 174,316,019 | 95.1% | 228,792,240 | 96.0% | 389,498,840 | 97.6% | 1,034,633,390 | 100.6% | 4,624,074,611 | 99.8% | 391,644,021 | 101.3% |
| N/4/2 | 115,650,082 | 106.5% | 173,110,541 | 100.2% | 196,392,926 | 95.4% | 696,533,900 | 99.6% | 3,717,590,891 | 99.7% | 263,232,887 | 96.8% |
| N/4/3 | 97,264,439 | 90.3% | 192,504,643 | 105.0% | 206,119,588 | 98.7% | 501,287,156 | 97.8% | 2,450,944,955 | 100.5% | 117,104,111 | 99.8% |
| N/4/4 | 153,464,379 | 115.1% | 195,703,696 | 97.5% | 283,447,385 | 103.6% | 660,958,667 | 98.5% | 2,334,500,075 | 100.6% | 114,833,821 | 92.9% |
| S/1/1 | 30,145,775 | 135.7% | 28,847,772 | 94.8% | 40,393,654 | 105.9% | 91,257,938 | 95.5% | 1,121,084,058 | 100.4% | 3,053,295,879 | 100.4% |
| S/2/1 | 42,753,182 | 109.0% | 58,621,979 | 94.9% | 74,856,996 | 93.5% | 177,694,957 | 94.1% | 1,089,203,131 | 98.8% | 467,391,621 | 102.0% |
| S/2/2 | 55,705,006 | 139.4% | 55,852,780 | 94.4% | 68,480,349 | 97.6% | 160,693,564 | 95.8% | 985,575,007 | 99.9% | 378,627,023 | 103.1% |
| U/1/1 | 10,542,256 | 105.7% | 4,761,704 | 71.4% | 5,043,816 | 83.1% | 8,605,933 | 73.9% | 118,632,047 | 109.6% | 350,453,223 | 102.9% |

uw x ia_band1

| uw | ia_band1: 18-24 Outcome | ia_band1: 18-24 Ratio | ia_band1: 25-34 Outcome | ia_band1: 25-34 Ratio | ia_band1: 35-44 Outcome | ia_band1: 35-44 Ratio | ia_band1: 45-54 Outcome | ia_band1: 45-54 Ratio | ia_band1: 55-64 Outcome | ia_band1: 55-64 Ratio | ia_band1: 65-74 Outcome | ia_band1: 65-74 Ratio | ia_band1: 75-84 Outcome | ia_band1: 75-84 Ratio | ia_band1: 85-99 Outcome | ia_band1: 85-99 Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| N/1/1 | 195,107,686 | 105.7% | 1,468,766,511 | 96.5% | 3,219,766,527 | 98.7% | 3,573,783,086 | 98.8% | 4,457,649,299 | 100.7% | 4,148,815,415 | 101.4% | 1,980,748,076 | 101.9% | 158,987,409 | 98.6% |
| N/2/1 | 87,770,402 | 106.8% | 804,579,965 | 100.9% | 1,676,447,802 | 99.4% | 1,660,839,957 | 99.4% | 1,585,401,151 | 99.4% | 2,098,151,266 | 100.0% | 1,649,244,265 | 100.3% | 99,333,245 | 102.4% |
| N/2/2 | 82,359,855 | 103.8% | 734,969,884 | 101.6% | 1,537,827,224 | 100.8% | 1,935,817,830 | 100.6% | 1,907,603,917 | 99.8% | 2,320,063,956 | 99.7% | 2,742,077,451 | 99.8% | 285,811,029 | 101.4% |
| N/3/1 | 42,971,227 | 81.7% | 542,310,521 | 102.1% | 1,509,846,536 | 100.8% | 1,517,047,115 | 99.7% | 1,083,241,294 | 99.0% | 682,592,797 | 98.6% | 557,962,237 | 99.0% | 14,400,000 | 101.3% |
| N/3/2 | 26,380,775 | 113.4% | 339,627,773 | 103.4% | 1,177,249,350 | 101.0% | 1,552,812,212 | 99.3% | 1,501,263,082 | 99.3% | 1,589,866,999 | 99.4% | 2,421,887,776 | 99.7% | 201,880,588 | 101.3% |
| N/3/3 | 29,150,024 | 97.3% | 391,634,706 | 103.1% | 1,364,327,286 | 100.9% | 1,932,181,705 | 100.6% | 2,306,037,454 | 100.5% | 2,995,484,451 | 99.7% | 5,412,062,505 | 99.9% | 539,471,926 | 99.2% |
| N/4/1 | 26,287,124 | 72.9% | 602,385,596 | 100.0% | 2,097,046,161 | 100.5% | 2,253,038,301 | 99.9% | 1,386,295,039 | 99.3% | 431,665,885 | 98.0% | 43,983,146 | 88.9% | 2,257,869 | 114.2% |
| N/4/2 | 9,905,787 | 98.3% | 233,732,519 | 96.5% | 1,112,629,077 | 101.0% | 1,595,224,194 | 99.3% | 1,413,399,308 | 99.3% | 448,165,240 | 98.0% | 313,744,476 | 101.9% | 35,710,626 | 92.6% |
| N/4/3 | 7,030,001 | 83.3% | 209,711,793 | 105.0% | 705,969,366 | 99.8% | 1,147,207,102 | 100.9% | 986,258,205 | 99.5% | 356,988,649 | 97.7% | 140,290,698 | 96.4% | 11,769,078 | 82.7% |
| N/4/4 | 9,949,832 | 106.2% | 167,099,067 | 106.8% | 638,256,939 | 101.7% | 1,003,674,951 | 101.1% | 958,007,317 | 100.1% | 551,614,283 | 99.1% | 345,232,117 | 98.1% | 69,073,517 | 95.1% |
| S/1/1 | 68,766,581 | 113.4% | 451,737,170 | 97.4% | 1,077,472,651 | 100.5% | 1,226,674,054 | 100.6% | 936,351,195 | 101.1% | 461,642,247 | 99.8% | 139,279,941 | 103.9% | 3,101,237 | 72.1% |
| S/2/1 | 26,876,275 | 136.8% | 179,096,556 | 94.0% | 474,899,912 | 97.6% | 608,218,812 | 100.7% | 402,893,657 | 97.5% | 174,002,125 | 101.9% | 42,414,641 | 92.6% | 2,119,888 | 188.8% |
| S/2/2 | 17,471,313 | 135.4% | 117,414,041 | 102.5% | 333,323,589 | 99.9% | 526,461,474 | 102.3% | 400,538,301 | 97.4% | 233,737,976 | 103.7% | 63,750,567 | 92.8% | 12,236,468 | 118.0% |
| U/1/1 | 37,904,199 | 93.1% | 24,461,713 | 113.1% | 65,390,717 | 115.7% | 96,529,522 | 109.8% | 122,212,871 | 106.6% | 116,046,227 | 94.9% | 31,334,054 | 89.9% | 4,159,676 | 89.9% |

uw x gender

| uw | gender: F Outcome | gender: F Ratio | gender: M Outcome | gender: M Ratio |
|---|---|---|---|---|
| N/1/1 | 6,871,313,340 | 99.6% | 12,332,310,669 | 100.2% |
| N/2/1 | 3,739,559,716 | 99.3% | 5,922,208,337 | 100.3% |
| N/2/2 | 4,668,956,646 | 100.1% | 6,877,574,500 | 100.3% |
| N/3/1 | 1,982,219,430 | 99.8% | 3,968,152,297 | 99.6% |
| N/3/2 | 3,036,089,596 | 99.0% | 5,774,878,959 | 100.3% |
| N/3/3 | 5,764,067,198 | 100.8% | 9,206,282,859 | 99.8% |
| N/4/1 | 1,822,625,798 | 99.6% | 5,020,333,323 | 99.7% |
| N/4/2 | 997,342,369 | 95.9% | 4,165,168,858 | 100.4% |
| N/4/3 | 561,153,990 | 96.5% | 3,004,070,902 | 100.6% |
| N/4/4 | 894,420,215 | 100.1% | 2,848,487,808 | 100.7% |
| S/1/1 | 1,618,467,456 | 102.6% | 2,746,557,620 | 99.3% |
| S/2/1 | 534,567,286 | 96.0% | 1,375,954,580 | 100.1% |
| S/2/2 | 553,360,607 | 106.3% | 1,151,573,122 | 98.4% |
| U/1/1 | 213,048,288 | 102.6% | 284,990,691 | 103.4% |

uw x insurance_plan

| uw | insurance_plan: Other Outcome | insurance_plan: Other Ratio | insurance_plan: Perm Outcome | insurance_plan: Perm Ratio | insurance_plan: Term Outcome | insurance_plan: Term Ratio | insurance_plan: xL Outcome | insurance_plan: xL Ratio |
|---|---|---|---|---|---|---|---|---|
| N/1/1 | 41,338,662 | 125.0% | 6,000,364,392 | 99.7% | 3,545,626,085 | 97.6% | 9,616,294,870 | 100.9% |
| N/2/1 | 10,081,665 | 102.2% | 1,335,375,364 | 99.7% | 2,232,053,273 | 100.2% | 6,084,257,751 | 99.8% |
| N/2/2 | 15,784,451 | 88.8% | 1,659,464,730 | 100.7% | 3,001,480,499 | 100.8% | 6,869,801,466 | 99.9% |
| N/3/1 | 40,123,942 | 233.9% | 317,915,492 | 97.9% | 3,500,287,255 | 99.4% | 2,092,045,038 | 99.5% |
| N/3/2 | 20,508,375 | 93.9% | 282,009,643 | 103.7% | 3,266,885,153 | 99.8% | 5,241,565,384 | 99.8% |
| N/3/3 | 32,085,711 | 112.9% | 496,907,377 | 102.2% | 4,052,652,325 | 100.0% | 10,388,704,644 | 100.1% |
| N/4/1 | 2,300,000 | 30.0% | 9,330,053 | 119.2% | 6,251,798,169 | 99.8% | 579,530,899 | 98.3% |
| N/4/2 | 7,223,579 | 69.2% | 7,940,683 | 93.4% | 4,382,666,145 | 99.8% | 764,680,820 | 98.6% |
| N/4/3 | 103,684 | 2.1% | 2,495,706 | 67.4% | 3,036,110,841 | 100.3% | 526,514,661 | 98.6% |
| N/4/4 | 3,739,130 | 21.5% | 12,550,388 | 100.6% | 2,742,467,039 | 100.8% | 984,151,466 | 101.1% |
| S/1/1 | 3,801,542 | 139.7% | 1,397,404,613 | 99.3% | 1,062,735,519 | 102.2% | 1,901,083,402 | 100.4% |
| S/2/1 | 6,544,922 | 114.8% | 163,842,775 | 95.9% | 1,131,185,266 | 99.6% | 608,948,903 | 98.3% |
| S/2/2 | 3,059,454 | 35.1% | 189,503,330 | 104.5% | 778,511,625 | 102.8% | 733,859,320 | 98.6% |
| U/1/1 | 657,183 | 79.6% | 286,818,715 | 103.5% | 39,156,859 | 100.6% | 171,406,222 | 103.1% |

uw x ltp

| uw | ltp: 5 yr Outcome | ltp: 5 yr Ratio | ltp: 10 yr Outcome | ltp: 10 yr Ratio | ltp: 15 yr Outcome | ltp: 15 yr Ratio | ltp: 20 yr Outcome | ltp: 20 yr Ratio | ltp: 25 yr Outcome | ltp: 25 yr Ratio | ltp: 30 yr Outcome | ltp: 30 yr Ratio | ltp: Not Level Term Outcome | ltp: Not Level Term Ratio | ltp: Unknown Outcome | ltp: Unknown Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| N/1/1 | 167,693,145 | 123.9% | 329,259,367 | 90.7% | 313,430,210 | 106.1% | 1,254,368,612 | 96.0% | 22,798,990 | 125.3% | 140,376,344 | 99.8% | 15,709,773,948 | 100.6% | 1,265,923,393 | 95.3% |
| N/2/1 | 109,944,549 | 110.2% | 251,674,560 | 102.2% | 154,929,926 | 95.4% | 1,003,102,210 | 99.3% | 54,969,322 | 96.1% | 91,526,385 | 98.3% | 7,455,632,156 | 99.8% | 539,988,945 | 101.7% |
| N/2/2 | 101,292,185 | 102.6% | 464,466,296 | 102.1% | 319,758,321 | 101.1% | 1,222,999,988 | 100.8% | 275,278,999 | 102.8% | 237,099,784 | 96.9% | 8,573,983,079 | 99.9% | 351,652,494 | 102.9% |
| N/3/1 | 41,020,000 | 115.3% | 402,661,935 | 98.5% | 437,170,279 | 98.0% | 1,876,102,407 | 99.5% | 32,892,500 | 82.8% | 371,739,248 | 98.0% | 2,700,444,573 | 100.3% | 88,340,785 | 109.9% |
| N/3/2 | 62,023,000 | 151.0% | 523,988,195 | 99.8% | 476,579,573 | 98.2% | 1,824,845,270 | 99.5% | 12,110,000 | 64.4% | 210,771,725 | 96.5% | 5,612,039,277 | 99.9% | 88,611,515 | 110.2% |
| N/3/3 | 59,980,899 | 84.1% | 896,107,443 | 101.1% | 652,611,790 | 100.7% | 1,842,591,351 | 100.5% | 16,475,000 | 87.4% | 201,629,249 | 96.0% | 11,161,374,076 | 100.1% | 139,580,249 | 106.7% |
| N/4/1 | 4,410,000 | 209.3% | 1,142,209,199 | 100.8% | 1,124,192,978 | 99.2% | 3,080,000,938 | 99.8% | 36,372,000 | 111.3% | 845,189,202 | 99.1% | 591,162,804 | 97.7% | 19,422,000 | 83.0% |
| N/4/2 | 985,000 | 49.1% | 899,393,774 | 99.5% | 918,062,758 | 99.3% | 2,118,167,322 | 100.4% | 14,432,100 | 86.9% | 414,673,900 | 100.3% | 779,845,082 | 98.2% | 16,951,291 | 72.8% |
| N/4/3 | 300,000 | 41.4% | 711,850,279 | 98.7% | 663,787,225 | 100.8% | 1,348,193,434 | 99.9% | 8,775,000 | 133.1% | 270,343,655 | 103.0% | 529,114,051 | 97.6% | 32,861,248 | 120.2% |
| N/4/4 | 1,900,000 | 83.0% | 861,440,026 | 100.5% | 549,878,289 | 101.8% | 1,117,119,304 | 100.8% | 8,832,928 | 86.8% | 186,021,492 | 104.7% | 1,000,440,984 | 99.7% | 17,275,000 | 69.7% |
| S/1/1 | 77,649,707 | 101.0% | 220,477,191 | 104.3% | 92,140,018 | 92.4% | 325,974,726 | 102.8% | 37,348,584 | 94.1% | 43,126,718 | 112.2% | 3,324,217,468 | 100.0% | 244,090,664 | 103.9% |
| S/2/1 | 35,606,411 | 82.4% | 312,687,204 | 100.6% | 190,844,063 | 99.6% | 450,793,441 | 101.0% | 2,925,000 | 97.0% | 50,493,377 | 106.9% | 784,528,034 | 98.0% | 82,644,336 | 93.5% |
| S/2/2 | 19,448,975 | 86.9% | 251,942,737 | 103.8% | 122,991,130 | 100.2% | 294,652,404 | 103.2% | 1,050,000 | 58.1% | 28,434,706 | 106.0% | 932,515,030 | 99.0% | 53,898,747 | 113.7% |
| U/1/1 | 4,089,643 | 105.2% | 3,506,517 | 95.5% | 1,380,000 | 26.5% | 8,137,200 | 135.6% | 962,499 | 33.6% | 1,306,000 | 85.4% | 459,244,973 | 103.3% | 19,412,147 | 125.7% |

uw x iy_band1

uw

iy_band1: 1900-1989

iy_band1: 1990-1999

iy_band1: 2000-2009

iy_band1: 2010+

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

N/1/1

982,160,538

103.0%

12,534,732,161

99.7%

4,842,664,276

99.4%

844,067,034

104.5%

N/2/1

112,802,132

111.3%

4,655,974,027

100.2%

4,443,736,468

99.7%

449,255,426

96.4%

N/2/2

107,232,938

106.9%

4,512,424,274

100.3%

6,312,557,239

100.2%

614,316,695

98.8%

N/3/1

282,000

22.4%

1,044,203,844

99.2%

3,887,317,728

99.7%

1,018,568,155

100.5%

N/3/2

250,000

15.1%

1,008,204,301

101.2%

6,934,988,157

99.9%

867,526,097

98.0%

N/3/3

4,055,561

33.6%

755,727,777

99.6%

12,308,187,396

100.1%

1,902,379,323

101.1%

N/4/1

0

0.0%

394,664,512

97.8%

4,606,160,597

99.7%

1,842,134,012

99.8%

N/4/2

308,834,925

97.3%

3,702,484,884

99.7%

1,151,191,418

99.5%

N/4/3

158,854,990

103.3%

2,386,096,189

99.5%

1,020,273,713

100.3%

N/4/4

0

0.0%

146,243,472

98.4%

2,314,552,528

100.3%

1,282,112,023

101.2%

S/1/1

212,607,764

102.1%

2,836,569,692

100.1%

1,124,908,213

101.2%

190,939,407

100.5%

S/2/1

7,889,930

105.4%

479,056,419

102.4%

1,077,182,521

98.8%

346,392,996

94.8%

S/2/2

7,665,540

130.5%

388,469,845

102.9%

986,492,416

101.4%

322,305,928

96.2%

U/1/1

29,233,111

112.9%

320,187,107

102.1%

117,964,696

109.5%

30,654,065

85.2%

face_amount_band x dur_band1

face_amount_band

dur_band1: 01

dur_band1: 02

dur_band1: 03

dur_band1: 04-05

dur_band1: 06-15

dur_band1: 16-25

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

01 - 0 - 49,999

37,859,834

193.5%

46,746,236

142.4%

59,349,580

141.6%

129,333,013

124.3%

856,334,359

108.6%

3,064,186,050

100.3%

04 - 50,000 - 99,999

52,511,061

130.3%

66,787,618

108.1%

90,269,149

111.6%

206,424,919

107.5%

1,469,057,939

98.8%

3,397,973,888

99.5%

05 - 100,000 - 249,999

191,101,070

106.9%

269,652,985

97.5%

371,980,209

102.4%

855,544,316

97.8%

7,074,965,104

99.7%

7,477,763,920

99.8%

06 - 250,000 - 499,999

209,469,031

100.9%

304,453,835

96.4%

416,690,820

100.8%

1,000,438,105

98.2%

8,109,440,540

99.7%

4,705,111,802

100.1%

07 - 500,000 - 999,999

215,479,316

92.7%

350,555,118

99.8%

461,397,659

99.1%

1,167,345,786

99.0%

8,791,983,362

99.8%

4,193,982,866

100.4%

08 - 1,000,000+

562,715,983

98.9%

814,571,289

98.3%

1,094,470,475

98.7%

2,962,359,193

100.0%

29,375,708,984

100.0%

7,481,721,046

100.2%

face_amount_band x ia_band1

ia_band1

face_amount_band: 01 - 0 - 49,999

face_amount_band: 04 - 50,000 - 99,999

face_amount_band: 05 - 100,000 - 249,999

face_amount_band: 06 - 250,000 - 499,999

face_amount_band: 07 - 500,000 - 999,999

face_amount_band: 08 - 1,000,000+

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

18-24

75,552,784

127.6%

133,231,819

113.5%

220,299,419

97.6%

115,618,404

89.3%

61,754,428

97.5%

61,474,227

112.3%

25-34

192,799,300

136.6%

458,412,108

103.6%

1,476,938,558

98.7%

1,398,298,659

98.7%

1,378,321,244

98.7%

1,362,757,946

98.7%

35-44

421,318,794

123.4%

898,534,810

101.8%

3,004,123,357

99.4%

3,258,823,688

99.5%

3,627,163,035

99.5%

5,780,489,453

99.7%

45-54

861,129,196

108.8%

1,234,583,452

99.7%

4,027,597,384

99.5%

3,776,175,782

99.5%

3,750,396,726

99.5%

6,979,627,775

99.8%

55-64

1,371,789,823

99.8%

1,302,388,605

98.8%

4,041,501,912

99.8%

3,254,795,121

100.1%

2,876,205,825

99.7%

6,600,470,804

100.1%

65-74

1,035,580,594

95.6%

950,356,588

98.8%

2,489,528,859

100.6%

1,929,056,738

100.7%

2,107,668,820

100.6%

8,096,645,917

100.2%

75-84

221,626,122

94.5%

285,327,929

96.9%

901,837,477

101.1%

915,794,848

101.0%

1,257,117,333

100.6%

12,302,308,241

99.9%

85-99

14,012,459

80.9%

20,189,263

86.9%

79,180,638

100.0%

97,040,893

101.9%

122,116,696

103.3%

1,107,772,607

99.6%

face_amount_band x gender

face_amount_band          gender: F                   gender: M
                          Outcome           Ratio     Outcome           Ratio
01 - 0 - 49,999           2,082,160,405     101.0%    2,111,648,667     106.5%
04 - 50,000 - 99,999      2,075,208,647     99.0%     3,207,815,927     100.9%
05 - 100,000 - 249,999    5,592,847,725     99.5%     10,648,159,879    99.8%
06 - 250,000 - 499,999    4,715,664,348     99.8%     10,029,939,785    99.7%
07 - 500,000 - 999,999    4,435,784,376     99.8%     10,744,959,731    99.7%
08 - 1,000,000+           14,355,526,434    100.0%    27,936,020,536    99.9%

face_amount_band x insurance_plan

face_amount_band

insurance_plan: Other

insurance_plan: Perm

insurance_plan: Term

insurance_plan: xL

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

01 - 0 - 49,999

4,987,841

101.9%

2,341,324,220

102.9%

430,107,626

113.7%

1,417,389,385

102.3%

04 - 50,000 - 99,999

5,666,783

104.3%

1,269,348,289

98.8%

1,013,673,047

103.1%

2,994,336,455

99.7%

05 - 100,000 - 249,999

34,041,514

113.1%

2,315,213,211

99.4%

6,874,173,321

99.6%

7,017,579,558

99.9%

06 - 250,000 - 499,999

28,639,038

132.3%

1,510,214,779

99.0%

8,348,184,153

99.6%

4,858,566,163

100.0%

07 - 500,000 - 999,999

18,132,026

94.5%

1,434,005,349

99.0%

8,575,363,506

99.6%

5,153,243,226

100.2%

08 - 1,000,000+

95,885,098

91.2%

3,291,817,413

99.8%

13,782,114,400

99.8%

25,121,730,059

100.0%

face_amount_band x ltp

ltp

face_amount_band: 01 - 0 - 49,999

face_amount_band: 04 - 50,000 - 99,999

face_amount_band: 05 - 100,000 - 249,999

face_amount_band: 06 - 250,000 - 499,999

face_amount_band: 07 - 500,000 - 999,999

face_amount_band: 08 - 1,000,000+

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

5 yr

40,106,609

113.4%

108,758,076

106.5%

223,852,354

106.4%

134,040,438

109.8%

88,356,037

115.8%

91,230,000

102.5%

10 yr

79,471,212

101.4%

152,307,148

101.6%

1,212,386,424

98.8%

1,312,587,882

101.1%

1,417,042,517

101.0%

3,097,869,540

99.6%

15 yr

63,154,393

107.0%

171,454,282

92.8%

1,115,915,003

98.7%

1,284,208,527

99.2%

1,254,669,304

100.8%

2,128,355,051

100.6%

20 yr

113,505,987

120.7%

278,179,336

104.4%

2,893,988,498

99.5%

3,894,785,171

99.6%

4,168,688,693

99.7%

6,417,900,922

99.8%

25 yr

9,580,391

164.9%

26,792,196

123.0%

154,206,000

101.4%

162,940,708

96.7%

101,887,627

90.5%

69,816,000

95.4%

30 yr

6,329,793

169.8%

31,267,129

124.0%

436,286,383

103.0%

721,667,111

98.3%

809,282,434

98.5%

1,087,898,935

99.0%

Not Level Term

3,802,855,700

102.8%

4,317,239,186

99.4%

9,477,900,932

99.7%

6,615,106,546

99.9%

6,758,480,601

99.6%

28,642,732,570

100.0%

Unknown

78,804,987

115.8%

197,027,221

106.3%

726,472,010

99.4%

620,267,750

97.8%

582,336,894

97.8%

755,743,952

99.0%

face_amount_band x iy_band1

face_amount_band

iy_band1: 1900-1989

iy_band1: 1990-1999

iy_band1: 2000-2009

iy_band1: 2010+

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

01 - 0 - 49,999

267,683,109

95.6%

2,808,814,596

100.9%

833,122,239

108.0%

284,189,128

135.7%

04 - 50,000 - 99,999

267,216,839

98.0%

3,149,056,719

99.6%

1,444,089,389

99.6%

422,661,627

107.2%

05 - 100,000 - 249,999

404,345,538

100.0%

7,205,856,453

99.9%

6,896,526,088

99.7%

1,734,279,525

99.1%

06 - 250,000 - 499,999

158,023,865

101.7%

4,708,333,873

100.3%

7,919,827,744

99.7%

1,959,418,651

98.4%

07 - 500,000 - 999,999

136,967,257

119.4%

4,219,050,921

99.8%

8,611,625,509

99.8%

2,213,100,420

98.6%

08 - 1,000,000+

229,942,906

120.0%

7,453,034,784

99.8%

29,340,102,339

99.9%

5,268,466,941

99.4%

dur_band1 x ia_band1

ia_band1

dur_band1: 01

dur_band1: 02

dur_band1: 03

dur_band1: 04-05

dur_band1: 06-15

dur_band1: 16-25

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

18-24

38,014,843

86.3%

30,257,763

72.7%

27,602,930

83.2%

48,778,652

91.5%

247,658,415

116.5%

275,618,478

104.0%

25-34

134,511,109

99.3%

163,517,330

100.5%

194,704,716

105.6%

424,147,771

104.2%

2,807,372,903

99.5%

2,543,273,986

99.3%

35-44

290,877,481

122.9%

372,345,417

104.0%

502,388,214

102.9%

1,120,149,604

98.4%

8,625,261,989

99.8%

6,079,430,432

99.7%

45-54

284,754,759

90.1%

443,414,694

96.9%

668,187,442

102.2%

1,648,875,363

99.6%

10,888,781,617

100.1%

6,695,496,440

100.3%

55-64

316,638,372

99.8%

537,383,357

102.7%

610,988,925

97.8%

1,580,728,594

98.9%

9,237,741,705

99.8%

7,163,671,137

100.2%

65-74

143,303,417

95.4%

240,017,175

96.0%

343,292,965

100.2%

930,432,092

101.6%

8,689,000,890

99.8%

6,262,790,977

100.2%

75-84

45,633,095

117.0%

49,322,782

85.7%

121,205,864

98.0%

506,866,160

102.0%

13,883,785,714

100.0%

1,277,198,335

99.5%

85-99

15,403,219

171.3%

16,508,563

101.1%

25,786,836

110.5%

61,467,096

91.9%

1,297,887,055

99.5%

23,259,787

90.2%

dur_band1 x gender

dur_band1                 gender: F                   gender: M
                          Outcome           Ratio     Outcome           Ratio
01                        322,942,506       116.2%    946,193,789       97.5%
02                        424,637,814       91.6%     1,428,129,267     101.7%
03                        697,046,249       101.1%    1,797,111,643     100.7%
04-05                     1,788,538,549     97.9%     4,532,906,783     100.6%
06-15                     19,652,776,350    99.8%     36,024,713,938    100.1%
16-25                     10,371,250,467    100.2%    19,949,489,105    99.9%

dur_band1 x insurance_plan

dur_band1

insurance_plan: Other

insurance_plan: Perm

insurance_plan: Term

insurance_plan: xL

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

01

6,987,296

74.8%

115,387,649

123.3%

816,300,396

96.9%

330,460,954

109.3%

02

12,956,089

97.1%

186,759,670

107.8%

1,193,573,840

98.8%

459,477,482

97.1%

03

10,940,700

66.9%

173,260,241

105.6%

1,539,568,218

100.2%

770,388,733

101.8%

04-05

41,382,619

117.0%

351,202,533

96.0%

3,960,972,351

100.4%

1,967,887,829

99.1%

06-15

92,001,723

94.9%

2,251,458,889

99.4%

24,546,892,971

99.9%

28,787,136,705

100.1%

16-25

23,083,873

153.3%

9,083,854,279

99.8%

6,966,308,277

100.1%

14,247,493,143

100.1%

dur_band1 x ltp

ltp

dur_band1: 01

dur_band1: 02

dur_band1: 03

dur_band1: 04-05

dur_band1: 06-15

dur_band1: 16-25

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

5 yr

8,474,532

141.6%

5,768,265

73.7%

13,120,000

106.1%

47,280,625

109.3%

258,887,272

107.3%

352,812,820

108.7%

10 yr

250,039,805

98.0%

374,445,553

99.4%

463,240,176

98.8%

1,216,385,768

98.8%

4,556,704,077

100.4%

410,849,344

103.6%

15 yr

106,613,913

100.1%

154,795,986

95.4%

219,264,610

105.5%

520,402,972

97.4%

4,466,179,563

99.6%

550,499,516

102.9%

20 yr

314,198,709

99.9%

447,558,227

99.0%

600,176,032

100.9%

1,543,338,839

100.5%

11,599,090,681

99.9%

3,262,686,119

99.5%

25 yr

3,615,999

44.5%

13,707,800

129.4%

18,921,628

118.2%

49,784,528

127.3%

267,598,048

95.6%

171,594,919

95.2%

30 yr

66,053,409

84.1%

98,748,126

100.4%

130,459,928

96.2%

388,766,012

103.3%

1,983,373,230

100.1%

425,331,080

97.4%

Not Level Term

453,235,980

111.8%

659,812,994

100.0%

954,906,252

101.9%

2,361,669,130

98.9%

31,616,749,475

100.0%

23,567,941,704

100.0%

Unknown

66,903,948

90.9%

97,930,130

98.4%

94,069,266

92.6%

193,817,458

105.2%

928,907,942

99.3%

1,579,024,070

99.7%

dur_band1 x iy_band1

dur_band1

iy_band1: 2010+

iy_band1: 2000-2009

iy_band1: 1990-1999

iy_band1: 1900-1989

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

01

1,269,136,295

101.7%

02

1,852,767,081

99.2%

03

2,494,157,892

100.8%

04-05

4,479,996,408

99.3%

1,841,448,924

101.3%

06-15

1,786,058,616

99.9%

51,370,835,971

99.9%

2,520,595,701

100.4%

16-25

1,833,008,413

98.8%

27,023,551,645

100.0%

1,464,179,514

103.2%

ia_band1 x gender

ia_band1                  gender: F                   gender: M
                          Outcome           Ratio     Outcome           Ratio
18-24                     210,351,040       93.6%     457,580,041       107.6%
25-34                     1,919,168,906     98.3%     4,348,358,909     100.6%
35-44                     4,376,241,525     99.9%     12,614,211,612    100.2%
45-54                     4,611,772,525     99.3%     16,017,737,790    100.2%
55-64                     4,809,017,158     100.0%    14,638,134,932    99.8%
65-74                     7,381,394,530     100.2%    9,227,442,986     99.7%
75-84                     9,045,279,898     100.1%    6,838,732,052     99.8%
85-99                     903,966,353       100.3%    536,346,203       98.7%

ia_band1 x insurance_plan

ia_band1

insurance_plan: Other

insurance_plan: Perm

insurance_plan: Term

insurance_plan: xL

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

18-24

708,665

57.4%

166,230,095

97.5%

267,466,700

100.1%

233,525,621

110.7%

25-34

3,256,794

65.1%

852,327,846

101.7%

4,057,228,956

99.2%

1,354,714,219

101.1%

35-44

21,307,402

110.1%

1,846,365,837

100.1%

11,715,945,894

100.1%

3,406,834,004

100.4%

45-54

57,525,597

129.7%

2,565,363,967

99.5%

12,641,031,673

99.9%

5,365,589,078

100.3%

55-64

47,144,955

81.0%

3,306,501,193

99.6%

8,357,779,097

99.9%

7,735,726,845

100.1%

65-74

35,749,661

103.7%

2,847,699,847

100.0%

1,896,982,781

100.4%

11,828,405,227

99.9%

75-84

21,392,494

97.5%

492,087,214

102.3%

86,369,648

101.6%

15,284,162,594

99.9%

85-99

266,732

14.7%

85,347,262

107.3%

811,304

108.2%

1,353,887,258

99.3%

ia_band1 x ltp

ia_band1

ltp: 5 yr

ltp: 10 yr

ltp: 15 yr

ltp: 20 yr

ltp: 25 yr

ltp: 30 yr

ltp: Not Level Term

ltp: Unknown

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

18-24

17,726,831

95.3%

32,009,383

100.3%

3,343,000

42.3%

80,676,936

111.9%

13,002,500

98.0%

48,824,425

104.2%

403,156,010

104.9%

69,191,996

92.5%

25-34

147,704,951

111.5%

317,080,031

104.5%

150,181,284

91.9%

1,670,129,784

99.7%

164,417,165

93.5%

748,387,532

98.2%

2,274,993,951

101.0%

794,633,117

98.3%

35-44

232,956,798

104.0%

1,049,894,842

101.4%

983,755,234

101.4%

6,112,699,729

99.8%

222,687,940

95.1%

1,620,869,320

100.2%

5,508,205,763

100.4%

1,259,383,511

99.0%

45-54

178,555,603

110.1%

2,312,649,990

100.2%

2,156,659,786

99.4%

6,421,617,312

99.8%

95,336,910

109.7%

652,477,911

99.1%

8,212,889,839

100.0%

599,322,964

100.0%

55-64

76,414,209

103.6%

2,489,338,251

99.5%

2,056,964,977

99.7%

3,341,836,281

100.2%

27,176,407

125.5%

22,072,597

100.9%

11,252,197,601

99.8%

181,151,767

104.1%

65-74

27,015,122

120.6%

1,014,083,731

99.1%

662,777,279

102.0%

140,088,565

96.2%

2,602,000

127.2%

100,000

44.3%

14,725,682,019

99.9%

36,488,800

108.5%

75-84

5,970,000

330.3%

56,608,495

89.6%

4,065,000

89.8%

0

0.0%

15,797,689,100

100.0%

19,679,355

130.4%

85-99

0

0.0%

10,000

7922.6%

1,439,501,252

99.7%

801,304

109.2%

ia_band1 x iy_band1

ia_band1

iy_band1: 1900-1989

iy_band1: 1990-1999

iy_band1: 2000-2009

iy_band1: 2010+

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

18-24

21,957,490

108.3%

248,374,326

101.8%

249,952,870

116.4%

147,646,395

86.5%

25-34

156,424,882

98.9%

2,415,663,468

99.5%

2,774,706,986

99.2%

920,732,479

103.3%

35-44

323,624,196

104.4%

5,875,528,027

99.7%

8,554,094,758

100.0%

2,237,206,156

101.3%

45-54

340,733,447

101.3%

6,605,000,254

100.2%

10,642,985,328

99.9%

3,040,791,286

99.4%

55-64

443,897,049

104.0%

6,912,220,124

100.0%

8,972,404,806

99.8%

3,118,630,111

99.4%

65-74

172,472,214

106.2%

6,127,207,226

100.2%

8,704,756,697

100.0%

1,604,401,379

98.4%

75-84

5,070,236

115.2%

1,325,459,599

99.5%

13,867,379,988

99.9%

686,102,127

102.4%

85-99

0

0.0%

34,694,322

93.4%

1,279,011,875

99.5%

126,606,359

102.8%

gender x insurance_plan

insurance_plan            gender: F                   gender: M
                          Outcome           Ratio     Outcome           Ratio
Other                     76,199,638        107.7%    111,152,662       96.1%
Perm                      4,259,680,761     99.5%     7,902,242,500     100.3%
Term                      8,746,195,310     99.2%     30,277,420,743    100.1%
xL                        20,175,116,226    100.2%    26,387,728,620    100.0%

gender x ltp

ltp                       gender: F                   gender: M
                          Outcome           Ratio     Outcome           Ratio
5 yr                      233,647,378       113.8%    452,696,136       105.3%
10 yr                     1,294,444,561     101.9%    5,977,220,162     99.7%
15 yr                     1,152,920,343     97.1%     4,864,836,217     100.5%
20 yr                     4,067,960,544     98.9%     13,699,088,063    100.2%
25 yr                     164,969,732       97.3%     360,253,190       98.8%
30 yr                     911,017,939       96.5%     2,181,713,846     100.9%
Not Level Term            24,668,222,437    100.1%    34,946,093,098    100.0%
Unknown                   764,009,001       99.3%     2,196,643,813     99.5%

gender x iy_band1

iy_band1                  gender: F                   gender: M
                          Outcome           Ratio     Outcome           Ratio
1900-1989                 415,833,187       105.2%    1,048,346,327     102.4%
1990-1999                 10,124,056,124    100.2%    19,420,091,222    99.9%
2000-2009                 19,505,796,137    99.8%     35,539,497,171    100.0%
2010+                     3,211,506,487     98.4%     8,670,609,805     100.5%

insurance_plan x ltp

ltp

insurance_plan: Term

insurance_plan: Other

insurance_plan: Perm

insurance_plan: xL

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

5 yr

686,343,514

108.0%

10 yr

7,271,664,723

100.1%

15 yr

6,017,756,560

99.8%

20 yr

17,767,048,607

99.9%

25 yr

525,222,922

98.4%

30 yr

3,092,731,785

99.5%

Not Level Term

702,195,128

97.4%

187,352,300

100.5%

12,161,923,261

100.0%

46,562,844,846

100.1%

Unknown

2,960,652,814

99.4%

insurance_plan x iy_band1

insurance_plan    iy_band1: 1900-1989         iy_band1: 1990-1999         iy_band1: 2000-2009         iy_band1: 2010+
                  Outcome           Ratio     Outcome           Ratio     Outcome           Ratio     Outcome           Ratio
Other             699,086           15.1%     24,035,150        211.0%    95,671,569        96.6%     66,946,495        93.8%
Perm              685,297,227       99.2%     8,414,471,562     99.9%     2,230,633,550     100.5%    831,520,922       100.9%
Term              110,349,436       113.7%    7,507,786,831     100.1%    24,023,250,728    99.9%     7,382,229,058     99.5%
xL                667,833,765       106.6%    13,597,853,803    99.9%     28,695,737,461    99.9%     3,601,419,817     100.8%

ltp x iy_band1

ltp

iy_band1: 1900-1989

iy_band1: 1990-1999

iy_band1: 2000-2009

iy_band1: 2010+

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

Outcome

Ratio

5 yr

11,828,904

103.9%

356,634,986

109.6%

256,413,452

107.5%

61,466,172

102.3%

10 yr

7,313,258

102.3%

395,337,236

102.3%

4,556,557,084

100.1%

2,312,457,145

99.5%

15 yr

770,000

43.9%

786,184,143

100.2%

4,222,555,938

99.6%

1,008,246,479

100.5%

20 yr

27,216,551

142.4%

3,469,245,630

99.6%

11,456,767,739

100.0%

2,813,818,687

99.6%

25 yr

0

0.0%

162,532,838

94.9%

278,353,329

97.4%

84,336,755

109.5%

30 yr

0

0.0%

553,191,034

98.0%

1,883,712,240

100.4%

655,828,511

98.5%

Not Level Term

1,355,860,609

102.4%

22,279,426,292

99.9%

31,476,742,520

99.9%

4,502,286,114

100.7%

Unknown

61,190,192

110.9%

1,541,595,187

100.8%

914,191,006

98.4%

443,676,429

95.7%

Comparison of Model Predictions

Goodness of Fit

Model performance should be compared on the test dataset, because models tend to fit well on the data on which they were trained.

We compute the mean squared error (MSE), mean absolute error (MAE), and Poisson deviance for each model on the test dataset. Lower values indicate a better fit.

Across all three measures, the elastic net GLM has the lowest error, with LightGBM not far behind. The main-effects GLM is not competitive, which reinforces the need for some accommodation of interactions.
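As a point of reference, the three measures can be sketched in a few lines of base R. The helper `fit_metrics()` below is hypothetical and is not the `val()` function called in this report. It assumes the deviance is the unscaled Poisson deviance and that the term y * log(y / mu) is zero when the actual value is zero.

```r
# Hypothetical helper (not the report's `val()`): MSE, MAE, and the unscaled
# Poisson deviance for vectors of actual and predicted values.
fit_metrics <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted), all(predicted > 0))
  # y * log(y / mu) is taken to be 0 when y == 0
  log_term <- ifelse(actual > 0, actual * log(actual / predicted), 0)
  c(
    mse = mean((actual - predicted)^2),
    mae = mean(abs(actual - predicted)),
    dev = 2 * sum(log_term - (actual - predicted))
  )
}

# Toy inputs, for illustration only
metrics <- fit_metrics(actual = c(0, 1, 2), predicted = c(1, 1, 2))
print(metrics)
```

Because all three measures are sums or averages over the same test rows, they are comparable across models only when every model is scored on the identical test subset.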

#-----------------------------------------#
##### Table of Results #####
#-----------------------------------------#

## creates a table of results for mse, mae, dev for the models

rbind(test[,val(get(resp_var),predictions_glm,get(resp_offset))],
                 test[,val(get(resp_var),predictions_glmnet,get(resp_offset))],
                 test[,val(get(resp_var),predictions_lgbm1,get(resp_offset))]) %>%
  set_rownames(c("glm","glmnet","lgbm")) %>%
  rownames_to_column(var="model") %>%
  as.data.table() %>%
  as_flextable()

model     mse                    mae           dev
glm       657,407,396,611,204    10,369,070    1,552,142,707,959,196
glmnet    285,309,606,550,480    8,470,467     723,646,527,740,728
lgbm      355,563,440,625,343    8,741,101     909,129,632,590,165

Graphical Model Comparison

Unlike the GLM, neither the LightGBM model nor the penalized GLM provides any information about parameter uncertainty. For elastic net GLMs, there are several options for estimating parameter uncertainty:

  1. Move to a fully Bayesian setting. This gives the modeler significant control, at the cost of complexity (e.g., how to choose reasonable priors) and computational cost. Stan and INLA are available for this purpose.
  2. Apply the method in Lockhart, Taylor, Tibshirani, and Tibshirani's A significance test for the lasso. This requires rerunning penalized GLMs and is thus potentially costly.
  3. Apply the method in Lederer’s Fundamentals of High-Dimensional Statistics, Sec. 5.2. While technically involved, there does not seem to be a heavy computational lift.

To get around the limitations of assessing uncertainty for now, we plot the models versus the envelope of uncertainty arising from the data itself. This shifts the point of view from assessing parameter uncertainty to assessing goodness-of-fit.

Below are plots of how the model performs versus marginal effects, with performance tested on the test subset. Black dots with error bars are from the actual-to-2015VBT ratio, with error bar width based on the dispersion from the GLM model. (Caution: this is at best a crude approximation.)
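The error-bar arithmetic used in the plotting code of this section can be illustrated on a single cell. The helper `ae_interval()` and the toy inputs below are hypothetical. The formula itself, the standard error as sqrt(sum of actual claims times the GLM dispersion) divided by the sum of expected claims, with bars at plus or minus 1.96 standard errors, is the one used in the plots.

```r
# Hypothetical helper mirroring the error-bar arithmetic in the plotting code:
#   a_e  = sum(actual) / sum(expected)
#   stde = sqrt(sum(actual) * dispersion) / sum(expected)
ae_interval <- function(actual, expected, dispersion, z = 1.96) {
  a_e  <- sum(actual) / sum(expected)
  stde <- sqrt(sum(actual) * dispersion) / sum(expected)
  c(a_e = a_e, lower = a_e - z * stde, upper = a_e + z * stde)
}

# Toy cell with made-up claims, expected claims, and dispersion
bar <- ae_interval(actual = c(40, 60), expected = c(50, 50), dispersion = 1)
print(bar)
```

Since the plots use a log-scaled axis, a lower bound at or below zero cannot be represented there, which is one more reason to treat the bars for thin cells as a rough guide.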

The following colors denote specific predictive model ratios versus the 2015 VBT:

  • Red uses GLM predicted claims
  • Blue uses predicted claims from the elastic net GLM
  • Green uses LightGBM predicted claims

Broad observations:

  • In 2017, some of the average relationships shifted versus 2013-2016. This can be seen by noting the model dots resting outside the error bars.
  • The elastic net model may be missing some higher-order interactions.

#-----------------------------------------#
##### Model Feature Plots #####
#-----------------------------------------#

## Select the top `nUseTopLightGBMInteractions` interactions from `imp.int2`
## and transform the `Feature1` and `Feature2` columns into a long format,
## ensuring each feature appears only once in the resulting list.

imp.int2 %>% 
  head(nUseTopLightGBMInteractions) %>%           # Select the top interactions
  select(Feature1, Feature2) %>%                  # Select the relevant columns
  pivot_longer(cols = c(Feature1, Feature2),      # Pivot to long format
               names_to = NULL,
               values_to = "Feature") %>%        
  distinct() %>%                                  # Remove duplicate features
  pull(Feature) -> int.subset                     # Extract the features into `int.subset`

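## Estimate the GLM dispersion as the Pearson chi-square statistic
## (working residuals squared times working weights) over the residual degrees of freedom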
glm_disp <- sum(modelGLM$residuals^2 * modelGLM$weights)/modelGLM$df.residual

## Select the top `nUseTopLightGBMInteractions` interactions from `imp.int2` again,
## this time keeping the feature pairs as two columns for `map2`.
## Note: this overwrites the long-format `int.subset` built above.
int.subset <- imp.int2 %>% 
  head(nUseTopLightGBMInteractions) %>%
  select(Feature1,Feature2)

## Apply a function to each pair of features in `int.subset` to generate plots
map2(
  .x = int.subset$Feature1,                       # First feature in the pair
  .y = int.subset$Feature2,                       # Second feature in the pair
  .f = function(s1, s2) {                         # Function to apply
    test[,                                        # Subset the data
         .(
           predicted_glm   = sum(predictions_glm)/sum(amount_2015vbt),
           predicted_glmnet= sum(predictions_glmnet)/sum(amount_2015vbt),
           predicted_lgbm1 = sum(predictions_lgbm1)/sum(amount_2015vbt),
           a_e             = sum(amount_actual)/sum(amount_2015vbt),
           stde            = sqrt(sum(amount_actual)*glm_disp)/sum(amount_2015vbt)
         ),
         by=c(s1,s2)]  %>%
      setnames(s1,"x") %>%
      setnames(s2,"byvar") %>%
      mutate(
        x = fct_relevel(
          x,
          sort(levels(x))
        ),
        byvar = fct_relevel(
          byvar,
          sort(levels(byvar))
        )
      ) %>%
      as.data.table() %>%
     ggplot(aes(x = x, y = a_e)) +                # Create a ggplot with `x` on the x-axis and `a_e` on the y-axis
      facet_wrap(vars(byvar)) +                   # Create separate panels for each value of `byvar`
      geom_point() +                              # Add points for actual values
      geom_errorbar(aes(ymin = a_e - 1.96 * stde, ymax = a_e + 1.96 * stde)) + # Add error bars
      geom_hline(yintercept = 1, linetype = 2) +  # Add a horizontal line at y = 1
      geom_point(aes(y = predicted_glm), color = "red") +   # Add points for GLM predictions
      geom_point(aes(y = predicted_glmnet), color = "blue") + # Add points for GLMNet predictions
      geom_point(aes(y = predicted_lgbm1), color = "green") + # Add points for LightGBM predictions
      scale_y_continuous(name = "Factor", labels = scales::percent, trans = "log") + # Log scale for y-axis
      scale_x_discrete(name = s1) +               # Name the x-axis after the first feature
      theme_minimal() +                           # Use a minimal theme
      theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))  # Rotate x-axis labels
  }
) %>%
  purrr::set_names(
    int.subset %>% 
      mutate(Features=paste0( Feature1, " x ", Feature2)) %>% 
      select(Features) %>%
      unlist
    ) %>%
  iwalk(~ {
    cat('### ', .y, '\n\n')   # Print the feature pair as a section header
    print(.x)                 # Print the plot
    cat('\n\n')               # Add some spacing
  } )

[Plots not reproduced here. One faceted plot of actual and model-predicted ratios to the 2015VBT was produced per feature pair, in this order: ia_band1 x uw; face_amount_band x uw; face_amount_band x ia_band1; dur_band1 x face_amount_band; dur_band1 x uw; dur_band1 x ia_band1; face_amount_band x gender; dur_band1 x ltp; face_amount_band x ltp; ltp x uw; face_amount_band x iy_band1; ia_band1 x ltp; insurance_plan x uw; gender x ia_band1; ia_band1 x iy_band1; gender x uw; face_amount_band x insurance_plan; ia_band1 x insurance_plan; iy_band1 x uw; dur_band1 x gender; insurance_plan x ltp; dur_band1 x iy_band1; dur_band1 x insurance_plan; gender x insurance_plan; iy_band1 x ltp; insurance_plan x iy_band1; gender x iy_band1; gender x ltp.]

Mortality Differences by Product Use Case

Observations from the Raw Data

When assessing differences by product, it is not hard to find challenges when looking at the raw, unadjusted data.

One example is that there is virtually no 4-class non-smoker exposure in Perm, while there is significant exposure in Term. This implies that there is a potential issue of identifiability in interactions between insurance plan and underwriting due to the imbalance in exposures. This manifests as an apparent instability in calibrations.

For example, the marginal ratio of 4-class to 2-class non-smoker mortality in the data is 82.4% (78.2%/94.9%), while the marginal ratio of Term to Perm mortality is 86.5% (83.4%/96.4%).
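The two marginal ratios can be checked directly from the rounded A/2015VBT percentages reported in this section. This is only a sketch of the arithmetic. The variable names are hypothetical, and ratios computed from unrounded values may differ in the last digit.

```r
# A/2015VBT values as reported in this section (rounded to one decimal)
ae_classes <- c("2" = 0.949, "4" = 0.782)   # non-smokers, by number of preferred classes
ae_plan    <- c(Perm = 0.964, Term = 0.834) # by insurance plan

ratio_classes <- ae_classes[["4"]] / ae_classes[["2"]]
ratio_plan    <- ae_plan[["Term"]] / ae_plan[["Perm"]]

cat(sprintf("4-class vs 2-class: %.1f%%\n", 100 * ratio_classes))
cat(sprintf("Term vs Perm:       %.1f%%\n", 100 * ratio_plan))
```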

ds[,
             c("Smoker_Status","NClasses","Class"):=tstrsplit(uw,"/")]

ds[Smoker_Status=="N",.(A_2015VBT=sum(get(resp_var))/sum(get(resp_offset))),
   by=.(NClasses)][order(NClasses)] %>%
  flextable() %>%
  set_header_labels(NClasses="No. of Pref. Classes",
                    A_2015VBT="A/2015VBT") %>%
  set_formatter(values = function(x) {
        if(is.numeric(x))
          sprintf( "%.1f%%", x*100 )
        else
          x
        }
        )

No. of Pref. Classes    A/2015VBT
1                       97.2%
2                       94.9%
3                       81.4%
4                       78.2%

ds[,.(A_2015VBT=sum(get(resp_var))/sum(get(resp_offset))),by=.(insurance_plan)] %>%
  flextable() %>%
  set_header_labels(insurance_plan="Insurance Plan",
                    A_2015VBT="A/2015VBT") %>%
  set_formatter(values = function(x) {
        if(is.numeric(x))
          sprintf( "%.1f%%", x*100 )
        else
          x
        }
        )

Insurance Plan    A/2015VBT
Term              83.4%
xL                89.2%
Perm              96.4%
Other             84.1%

By way of comparison, the GLM calibrates a factor of 86.8% for Term relative to Perm, and the weighted average factor from the GLM is 79.3% for 4-class systems versus 95.7% for 2-class systems, for a ratio of 82.8%. The main-effects GLM is therefore asserting that both conditions are associated with lower mortality.

For the elastic net GLM, the situation is more complicated. In all, 52 factors mention insurance plan, and it is hard to tell from a bare reading of the factor table when Perm and Term differ.

reformatCoefs(cvfit, pred_cols)  %>%
  filter(Coef != 0) %>%
  select(Feature1Name,
         Feature1Level,
         Feature2Name,
         Feature2Level,
         Coef) %>%
  mutate(Coef=exp(Coef)) %>%
  filter(
    Feature1Name == "insurance_plan" | Feature2Name == "insurance_plan"
  ) %>%
  select(Feature2Name, Feature2Level,
         Feature1Name, Feature1Level,
         Coef) %>%
  arrange(Feature2Name, Feature2Level,
         Feature1Name, Feature1Level) %>%
  flextable() %>%
  set_header_labels(Coef="Factor") %>%
  set_formatter(values = function(x) {
        if(is.numeric(x))
          sprintf( "%.1f%%", x*100 )
        else
          x
        }
        )

For the LightGBM model, there is arguably no interesting mean difference between Perm and Term.
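The LightGBM summaries in this section turn log-scale SHAP values into a multiplicative factor by exponentiating them and averaging with the offset (expected claims) as the weight. A minimal sketch of that aggregation follows. The helper `shap_factor()` and the toy numbers are hypothetical. In the report's code, `shp$baseline` is added to the feature's SHAP values before exponentiating.

```r
# Hypothetical helper: offset-weighted average of exp(SHAP) within a group,
# mirroring sum(exp(shap) * offset) / sum(offset) in the report's summaries.
shap_factor <- function(shap, offset) {
  sum(exp(shap) * offset) / sum(offset)
}

# Toy values: two cells with log-scale contributions and expected claims
toy_factor <- shap_factor(shap = log(c(0.9, 1.1)), offset = c(100, 300))
print(toy_factor)
```

Weighting by the offset means cells with more expected claims dominate the group average, so the result is on the same footing as an actual-to-expected ratio.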

data.table(
    insurance_plan=train[shp_int_subset,insurance_plan],
    shap=shp$S[,which(names(shp$X)=="insurance_plan")] + shp$baseline,
    response=train[shp_int_subset,get(resp_var)],
    offset=train[shp_int_subset,get(resp_offset)]
) %>%
  group_by(insurance_plan) %>%
  summarize(M_T=sum(exp(shap)*offset)/sum(offset)) %>%
  flextable() %>%
  set_header_labels(insurance_plan="Insurance Plan",
                    M_T="Model / 2015VBT") %>%
  set_formatter(values = function(x) {
        if(is.numeric(x))
          sprintf( "%.1f%%", x*100 )
        else
          x
        }
        )

Insurance Plan    Model / 2015VBT
Other             67.0%
Perm              91.3%
Term              93.3%
xL                102.9%

For the preferred class system, the LightGBM model shows a substantial reduction in mean mortality for 4-class systems relative to 2-class systems.

shaps.uw <- data.table(
    uw=train[shp_int_subset,uw],
    shap=shp$S[,"uw"] + shp$baseline,
    response=train[shp_int_subset,get(resp_var)],
    offset=train[shp_int_subset,get(resp_offset)]
) 

shaps.uw[,
             c("Smoker_Status","NClasses","Class"):=tstrsplit(uw,"/")]

shaps.uw %>%
  group_by(NClasses) %>%
  summarize(M_T=sum(exp(shap)*offset)/sum(offset)) %>%
  flextable() %>%
  set_header_labels(NClasses="No. of Pref. Classes",
                    M_T="Model / 2015VBT") %>%
  set_formatter(values = function(x) {
        if(is.numeric(x))
          sprintf( "%.1f%%", x*100 )
        else
          x
        }
        )

No. of Pref. Classes    Model / 2015VBT
1                       104.4%
2                       100.9%
3                       101.0%
4                       86.2%

All of this strongly suggests the need for more sophisticated analysis.

Observations from the GLM

One question of interest to actuaries is why different products have different mortality outcomes. Many things could contribute to the difference, such as UW practice, anti-selection risk level, market segment, etc., and generally, it is hard to quantify their impact. With the GLM model and relevant analysis, we have a possible solution.

Let us revisit the table output with insurance plan as the predictor of interest.

## mainF() tabulates the weighted average GLM factors by level of a single predictor (here, insurance_plan)
mainF(df     = ds,
      model  = modelGLM,
      rf     = "insurance_plan",
      resp   = resp_var,
      offset = resp_offset) %>%
  flextable() %>%
  set_header_labels("rowname" = "") %>%
  set_formatter(values = ~ if(is.numeric(.)) sprintf("%.1f%%", . * 100) else .) %>%
  set_caption(caption = paste0("Weighted Average GLM Factors for Variable: insurance_plan"))

Weighted Average GLM Factors for Variable: insurance_plan

                             Other     Perm      Term      xL
amount_2015vbt               0.8411    0.9638    0.8340    0.8923
Factor: insurance_plan       1.0000    0.8749    0.7596    0.9590
Ave Fac: dur_band1           0.9306    0.9098    0.9102    0.9024
Ave Fac: face_amount_band    0.7300    0.7913    0.7339    0.7366
Ave Fac: gender              1.0067    1.0070    1.0082    1.0060
Ave Fac: ia_band1            0.9124    0.9220    0.9325    0.8870
Ave Fac: iy_band1            0.8452    0.9105    0.8636    0.8741
Ave Fac: ltp                 0.6733    0.6733    0.8455    0.6733
Ave Fac: uw                  0.9790    1.0127    0.9051    1.0025

For illustration, let us select Perm and Term for pairwise comparison. By A/15VBT (the amount_2015vbt row), Perm (96.4%) appears to have worse mortality than Term (83.4%). Is this due to “product differences”?

mainF(df = ds, 
      model  = modelGLM, 
      rf     = "insurance_plan", 
      resp   = resp_var, 
      offset = resp_offset) %>%
  select(rowname, Perm, Term) %>%
  flextable() %>%
  set_header_labels("rowname" = "") %>%
  set_formatter(values = ~ if(is.numeric(.)) sprintf("%.1f%%", . * 100) else .) %>%
  set_caption(caption = paste0("Weighted Average GLM Factors for Variable: insurance_plan in (Perm, Term)"))

Weighted Average GLM Factors for Variable: insurance_plan in (Perm, Term)

                             Perm      Term
amount_2015vbt               0.9638    0.8340
Factor: insurance_plan       0.8749    0.7596
Ave Fac: dur_band1           0.9098    0.9102
Ave Fac: face_amount_band    0.7913    0.7339
Ave Fac: gender              1.0070    1.0082
Ave Fac: ia_band1            0.9220    0.9325
Ave Fac: iy_band1            0.9105    0.8636
Ave Fac: ltp                 0.6733    0.8455
Ave Fac: uw                  1.0127    0.9051

Because the GLM is multiplicative, its prediction is a product of factors, one per predictor. This structure lets us isolate the contribution of each predictor and compare products. We measure the relative impact of a predictor as the ratio of its average factor for Perm to its average factor for Term.

Of these, the movement in uw is the most influential, with a ratio of 111.9% (1.0127 / 0.9051) for Perm over Term. That is, with all other predictors held fixed, the average uw factor on a risk-adjusted basis makes the Perm mortality prediction approximately 11.9% higher than that of Term. Other influential drivers from this analysis include face amount band, issue year band, and level term period. This suggests that actuaries and modelers who want a simpler model that still captures the essential drivers of mortality should consider including at least these predictors in the GLM.
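
The Perm-over-Term ratios discussed above can be reproduced directly from the table. The following is a minimal sketch in base R; the data frame `fac` is a name introduced here for illustration, and its values are copied by hand from the Perm and Term columns of the table rather than pulled from `mainF`.

```r
# Average factors copied from the Perm/Term table above.
# `fac` is an illustrative name, not an object from the original analysis.
fac <- data.frame(
  predictor = c("insurance_plan", "dur_band1", "face_amount_band", "gender",
                "ia_band1", "iy_band1", "ltp", "uw"),
  Perm = c(0.8749, 0.9098, 0.7913, 1.0070, 0.9220, 0.9105, 0.6733, 1.0127),
  Term = c(0.7596, 0.9102, 0.7339, 1.0082, 0.9325, 0.8636, 0.8455, 0.9051)
)

# Relative impact of each predictor: ratio of the Perm factor to the Term factor.
fac$ratio <- fac$Perm / fac$Term

# Show the ratios as percentages, in the same row order as the table.
print(transform(fac, ratio = sprintf("%.1f%%", 100 * ratio)), row.names = FALSE)
```

Because the tabulated values are weighted averages, the product of these ratios need not reproduce the ratio of the amount_2015vbt row exactly.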

One should nonetheless look to residuals and distributions to ensure that valuable interactions are not being lost. Part of what we are seeing has to do with the different distributions between Perm and Term of uw and face_amount_band. Perm tends to favor 1- and 2-class risk class systems, while Term tends to favor 3- and 4-class systems. Perm also tends to favor lower face amounts, while Term favors higher face amounts.
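
To see how a difference in distribution alone can move an average factor, consider the sketch below. It assumes that an average factor is a weighted mean of level factors, with weights given by each product's share of expected claims; that weighting is an assumption made here, and all numbers are hypothetical rather than model output.

```r
# Hypothetical factors by face amount band (illustrative only, not model output).
band_factor <- c(low = 1.10, mid = 1.00, high = 0.80)

# Hypothetical shares of expected claims by band for each product.
# Perm is concentrated in low bands, Term in high bands.
mix_perm <- c(low = 0.60, mid = 0.30, high = 0.10)
mix_term <- c(low = 0.10, mid = 0.30, high = 0.60)

# The same level factors give different averages under different mixes.
avg_perm <- weighted.mean(band_factor, mix_perm)
avg_term <- weighted.mean(band_factor, mix_term)

c(Perm = avg_perm, Term = avg_term, ratio = avg_perm / avg_term)
```

Here both products share identical band factors, so the entire gap between the two averages comes from mix. This is why the distributions should be inspected before reading an average-factor ratio as a product effect.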

## create histogram of 2015 VBT Tabular Claims by Count and UW
ds %>%
  filter(insurance_plan %in% c("Perm","Term")) %>%
  group_by(uw,
           insurance_plan) %>%
  summarize(
            AM_policy=sum(policy_actual)/sum(predictions_glm),
            policy_2015vbt=sum(policy_2015vbt)) %>%
  as.data.table() %>%
  ggplot(aes(x = uw)) +
    geom_bar(aes(y = policy_2015vbt), stat = "identity") +
    facet_wrap(facets = vars(insurance_plan)) +
    scale_y_continuous(labels = scales::number, name = "2015 VBT Tabular Claims by Count") +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 45))

## create histogram of 2015 VBT Tabular Claims by Count and Face Amount
ds %>%
  filter(insurance_plan %in% c("Perm","Term")) %>%
  group_by(face_amount_band,
           insurance_plan) %>%
  summarize(
            AM_policy=sum(policy_actual)/sum(predictions_glm),
            policy_2015vbt=sum(policy_2015vbt)) %>%
  as.data.table() %>%
  ggplot(aes(x = face_amount_band)) +
    geom_bar(aes(y = policy_2015vbt), stat = "identity") +
    facet_wrap(facets = vars(insurance_plan)) +
    scale_y_continuous(labels = scales::number, name = "2015 VBT Tabular Claims by Count") +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

Given these distributions, it is unsurprising that the model fits the Term subset poorly for smaller face amounts and for 1- and 2-class systems, while it fits the Perm subset poorly for 3- and 4-class systems. Since Perm tends to dominate the lower face amounts, model fit for Perm is not nearly as poor there as it is for the Term subset.

## Create graph of Residuals of Actuals versus Count-based Model by UW

ds %>%
  filter(insurance_plan %in% c("Perm","Term")) %>%
  group_by(uw,
           insurance_plan) %>%
  summarize(
            AM_policy=sum(amount_actual)/sum(predictions_glm),
            policy_actual=sum(amount_actual),
            predictions_glm=sum(predictions_glm),
            amount_2015vbt=sum(amount_2015vbt)) %>%
  as.data.table() %>%
  ggplot(aes(x = uw)) +
    geom_point(aes(y = policy_actual / amount_2015vbt), color = "red") +
    geom_point(aes(y = AM_policy, group = 1)) +
    geom_errorbar(aes(ymin = AM_policy - 1.96 * sqrt(glm_disp / predictions_glm), ymax = AM_policy + 1.96 * sqrt(glm_disp / predictions_glm))) +
    geom_hline(yintercept = 1, color = "blue", linetype = 2) +
    facet_wrap(facets = vars(insurance_plan)) +
    scale_y_continuous(labels = scales::percent, name = "Actual-to-Model Ratio") +
    ggtitle("Residuals of Actuals versus Count-based Model") +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

## Create graph of Residuals of Actuals versus Count-based Model by Face Amount

ds %>%
  filter(insurance_plan %in% c("Perm","Term")) %>%
  group_by(face_amount_band,
           insurance_plan) %>%
  summarize(
            AM_policy=sum(amount_actual)/sum(predictions_glm),
            policy_actual=sum(amount_actual),
            predictions_glm=sum(predictions_glm),
            amount_2015vbt=sum(amount_2015vbt)) %>%
  as.data.table() %>%
  ggplot(aes(x = face_amount_band)) +
    geom_point(aes(y = policy_actual / amount_2015vbt), color = "red") +
    geom_point(aes(y = AM_policy, group = 1)) +
    geom_errorbar(aes(ymin = AM_policy - 1.96 * sqrt(glm_disp / predictions_glm), ymax = AM_policy + 1.96 * sqrt(glm_disp / predictions_glm))) +
    geom_hline(yintercept = 1, color = "blue", linetype = 2) +
    facet_wrap(facets = vars(insurance_plan)) +
    scale_y_continuous(labels = scales::percent, name = "Actual-to-Model Ratio") +
    ggtitle("Residuals of Actuals versus Count-based Model") +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
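
The error bars in the residual plots above are drawn at the actual-to-model ratio plus or minus 1.96 * sqrt(glm_disp / predictions_glm). One way to read this, offered here as an assumption rather than a statement of the authors' derivation, is a normal approximation under a quasi-Poisson variance, where the variance of the actuals is the dispersion times the model prediction, so that the variance of the ratio is the dispersion divided by the prediction. The helper `am_interval` and its inputs are hypothetical.

```r
# Approximate 95% interval for an actual-to-model (A/M) ratio.
# Assumes Var(actual) = dispersion * predicted (quasi-Poisson style),
# so that Var(actual / predicted) = dispersion / predicted.
# `am_interval` is a hypothetical helper, not part of the original code.
am_interval <- function(actual, predicted, dispersion, z = 1.96) {
  am <- actual / predicted
  half_width <- z * sqrt(dispersion / predicted)
  data.frame(am = am, lower = am - half_width, upper = am + half_width)
}

# Two hypothetical cells: a small one and a larger one.
res <- am_interval(actual = c(95, 480), predicted = c(100, 400), dispersion = 1.5)
res
```

The interval narrows as the predicted claims in a cell grow, which is why sparsely populated cells show wide bars even when their A/M ratio is far from 100%.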

Fitting a main-effects GLM on a dataset is a frequently used first step when modeling any dataset. Analyzing residuals from this model and assessing parameter variability by subgroup can reveal useful patterns for further analysis. It is often the case that interactions of effects are present. While a useful starting point, main-effects models cannot capture such interactions effectively. It is therefore necessary to turn to richer models and approaches.
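
A small sketch illustrates why a main-effects model cannot capture an interaction. The four-cell dataset below is hypothetical: mortality is elevated only in the Term and class B cell. A main-effects Poisson GLM reproduces the marginal totals but leaves residual patterns within cells, while adding the interaction term removes them.

```r
# Hypothetical 2 x 2 dataset; plan varies fastest in expand.grid.
cells <- expand.grid(plan = c("Perm", "Term"), cls = c("A", "B"))
cells$expected <- 100                    # tabular expected claims per cell
cells$actual   <- c(100, 100, 100, 150)  # only Term / B is elevated

# Main-effects model: one factor per plan and one per class.
fit_main <- glm(actual ~ plan + cls + offset(log(expected)),
                family = poisson(), data = cells)

# Model with the plan x class interaction.
fit_int <- glm(actual ~ plan * cls + offset(log(expected)),
               family = poisson(), data = cells)

# Actual-to-model ratios by cell under each model.
cells$am_main <- cells$actual / fitted(fit_main)
cells$am_int  <- cells$actual / fitted(fit_int)
cells
```

Under the main-effects fit every cell has an A/M ratio away from 100%, even though only one cell is truly different. This is the kind of pattern that residual analysis by subgroup is meant to surface.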

Observations from the Gradient Boosted Decision Tree

Below are box plots of SHAP values for insurance_plan by the other variables.

imp.int2["insurance_plan" == Feature1 | "insurance_plan" == Feature2] %>% 
    head(nPlotTopInteractions) %>%
    select(Feature1,Feature2) %>%
    pivot_longer(cols=c(Feature1,Feature2),
                 names_to=NULL,
                 values_to="Feature") %>%
    distinct() %>%
    filter(Feature != "insurance_plan") -> 
    int.vars

plist <- ilec_shap_plot(
    shp,
    "insurance_plan",
    setdiff(pred_cols,"insurance_plan"),
    resp_var = resp_var,
    resp_offset = resp_offset,
    train.data = train[shp_int_subset]
  )


plist %>%
  iwalk(~ {
    cat('### ',.y,'\n\n')
    print(.x)
    cat('\n\n')
  } )

[Figures: box plots of SHAP values for insurance_plan, one per view: main effect; insurance_plan x uw; insurance_plan x face_amount_band; insurance_plan x dur_band1; insurance_plan x ia_band1; insurance_plan x gender; insurance_plan x ltp; insurance_plan x iy_band1]

We have noted the interactions with insurance plan from the LightGBM SHAP values as follows:

  1. Main Effect: Perm and Term shap distributions are qualitatively similar, with the xL class higher.
  2. Interaction with underwriting:
    1. Substantial interaction with the “Other” category
    2. For Term, some evidence of higher mortality for N/4/3 and N/4/4
  3. Interaction with face amount band:
    1. No obvious interactions with Perm and Term
    2. Weak evidence for interaction with xL, based on U-shaped pattern in boxplots
    3. Face amounts 1 million and higher for “Other” are plainly different from lower face amount “Other”
  4. Interaction with duration:
    1. Weak evidence for elevated mortality in early durations for Perm
    2. Weak evidence for the opposite in early durations for Term
  5. Interaction with issue age:
    1. Evidence for different issue age slope (relative to 2015VBT) for xL based on downward trend in boxplots
    2. Weak evidence for slight upward issue age slope (relative to 2015VBT) for Term based on trend in boxplots
  6. Interaction with gender: no obvious interaction
  7. Interaction with level term period:
    1. Obviously, no interactions outside of Term
    2. Within Term, “Not Level Term” has lower mortality than the other level term types
  8. Interaction with issue year band:
    1. Since 1990, there is evidence of an upward trend in mortality for all categories outside of “Other”.

Contrasts of Interactions with Insurance Plan from the Elastic Net Model

The elastic net model encodes interesting interactions of insurance plan with other predictor variables. Graphing the contrast between insurance plan types can reveal patterns which are difficult to see when looking at the bare coefficients or tables of factors.

To do so, we gather the tables of factors that include insurance plan and divide each plan's factors by the corresponding factors for Term. For example, if the marginal factor for Perm males is 99% and the factor for Term males is 90%, then the ratio is 110%. These contrasts can then be plotted across the plan comparisons for each interacting variable, as in the following graphs.
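
Before the full code, here is a minimal base R sketch of the normalization step on a toy table. All factor values are hypothetical, chosen so that the Perm male entry matches the 99% versus 90% example above; `factors` and `vs_term` are names introduced here for illustration.

```r
# Hypothetical marginal factors by gender and plan (illustrative values only).
factors <- data.frame(
  gender = c("Male", "Female"),
  Perm   = c(0.99, 0.95),
  Term   = c(0.90, 0.92),
  xL     = c(0.945, 0.92)
)

# Divide every non-Term plan column by the Term column, row by row.
plans   <- setdiff(names(factors), c("gender", "Term"))
vs_term <- factors["gender"]
for (p in plans) {
  vs_term[[paste0(p, " vs. Term")]] <- factors[[p]] / factors$Term
}

vs_term
```

A contrast above 100% means the plan's factor at that level is higher than Term's at the same level; a flat line across levels would indicate no interaction between the plan and that variable.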

## Combine the list of glmnet interactions into a data.table and filter relevant interactions
data.table(do.call(rbind, glmnet.int.list)) %>%
  filter((Feature1Name == "insurance_plan" | Feature2Name == "insurance_plan") &
         (Feature1Name != "ltp" & Feature2Name != "ltp")) ->
  ints.with.plan

## Extract interaction feature names related to "insurance_plan"
c(ints.with.plan[Feature1Name == "insurance_plan", Feature2Name],
  ints.with.plan[Feature2Name == "insurance_plan", Feature1Name]) ->
  ints.with.plan

## Map over each interaction feature
ints.with.plan %>%
  map(
    .f = \(x) {
      symx <- sym(x)  # Convert feature name to symbol

      ## Create a table of coefficients for the interaction with "insurance_plan"
      tableCVNetCoefs(train.grid,
                      c(x, "insurance_plan"),
                      "Factor",
                      pred_cols,
                      levellist = list("dur_band1" = "04-05")) %>%
        data.table() ->
        tblFacts

      ## Normalize factors by "Term" and pivot longer for ggplot
      cbind(
        tblFacts[, 1],
        tblFacts[, lapply(.SD, "/", Term), .SDcols = colnames(tblFacts)[-1]]
      ) %>%
        pivot_longer(
          cols = colnames(tblFacts)[-1],
          values_to = "Factor",
          names_to = "comparison"
        ) %>%
        filter(comparison != 'Term') %>%
        mutate(comparison = paste0(comparison, " vs. Term")) %>%
        data.table() ->
        dftmp

      ## Special handling if the feature is "uw"
      if (x == "uw") {
        dftmp %>%
          separate(
            col = uw,
            into = c("Smoker_Status", "NClasses", "Class"),
            sep = "/",
            remove = FALSE
          ) ->
          dftmp

        ## Filter out single class and plot
        dftmp %>%
          filter(NClasses != 1) %>%
          ggplot(aes(x = Class, y = Factor)) +
          geom_line(aes(group = comparison, color = comparison)) +
          scale_y_continuous(labels = scales::percent) +
          facet_wrap(Smoker_Status ~ NClasses, labeller = label_both) +
          theme_minimal() +
          ggtitle(
            "Ratio of Insurance Plan Factor to Term Insurance Plan Factor",
            subtitle = paste0("by ", x)  # label with the current feature, not a fixed list position
          ) -> p
      } else {
        ## General plot for other features
        dftmp %>%
          ggplot(aes(x = !!symx, y = Factor)) +
          geom_line(aes(group = comparison, color = comparison)) +
          scale_y_continuous(labels = scales::percent) +
          theme_minimal() +
          ggtitle(
            "Ratio of Insurance Plan Factor to Term Insurance Plan Factor",
            subtitle = paste0("by ", x)
          ) +
          theme(axis.text.x = element_text(angle = ifelse(x == "face_amount_band", 45, 0))) +
          scale_color_viridis_d() -> p
      }

      p  # Return the plot
    }
  ) %>%
  purrr::set_names(ints.with.plan) %>%
  iwalk(~ {
    cat('### ', .y, '\n\n')
    print(.x)
    cat('\n\n')
  })

[Figures: ratio of insurance plan factor to Term insurance plan factor, one plot each by iy_band1, uw, face_amount_band, ia_band1, dur_band1, and gender]

  • It appears that the gap between xL (UL/VL/ULSG/VLSG) and Term has been narrowing with increasing issue year.
  • xL and Other tend to have a wider spread of face amount factors than Term, while Perm has a narrower spread than Term.
  • Perm and xL tend to have a flatter slope than Term by issue age, except above issue age 65. Above issue age 65, the slopes of Perm and xL diverge.
  • Perm tends to have higher duration 2 experience than the other plans.
  • The gender differential for males is narrower for Perm than for Term.

A different view helps illustrate the interactions of underwriting and insurance plan. It is easier to see in this view that

  • The residual standard class of a 2-class non-smoker system for Term is much higher than the others.
  • The spread for Term and xL in the 4-class non-smoker system is wider than for Perm and Other.

## Create graph of interaction of Underwriting and Insurance Plan

tableCVNetCoefs(train.grid,
                c("uw", "insurance_plan"),
                "Factor",
                pred_cols,
                levellist = list("dur_band1" = "04-05")) %>%
  pivot_longer(cols = c("Perm", "Term", "Other", "xL"), values_to = "Factor", names_to = "insurance_plan") %>%
  separate(col = uw, into = c("Smoker_Status", "NClasses", "Class"), sep = "/", remove = FALSE) %>%
  data.table() -> tblFacts


tblFacts %>%
  ggplot(aes(x = insurance_plan, y = Factor)) +
    geom_line(aes(group = Class, color = Class)) +
    facet_wrap(Smoker_Status ~ NClasses, labeller = label_both) +
    theme_minimal() +
    scale_y_continuous(labels = scales::percent) +
    scale_color_viridis_d() +
    ggtitle(
      "Interaction of Underwriting and Insurance Plan",
      subtitle = "from the elastic net model"
    )

Summary

In this analysis, we explored the application of a predictive modeling framework within actuarial experience studies, focusing on mortality differences by product type in the ILEC dataset. The analysis revealed several mortality differentials that help explain the drivers of mortality. The framework and findings demonstrate both the use of predictive modeling and the considerations it requires, and they underscore its value in making informed actuarial decisions.

Acknowledgements

Working Group Members

This framework is the result of the tireless efforts of members of the Individual Life Experience Committee, including

  • Philip Adams, FSA (chair)
  • Cynthia Edwalds, FSA
  • Brian Holland, FSA
  • Ed Hui, FSA
  • Michael Niemerg, FSA
  • Haofeng Yu, FSA

Society of Actuaries

The authors would like to thank the staff of the Society of Actuaries for their help throughout this project. Many thanks to Korrel Crawford and Pete Miller.

Appendices

Computational Requirements

## view session info for documentation purposes
sessionInfo()
## R version 4.2.3 (2023-03-15)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.4 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] flexlsx_0.2.1      openxlsx2_1.8      MatrixModels_0.5-1 patchwork_1.1.2   
##  [5] shapviz_0.7.0      here_1.0.1         arrow_11.0.0.3     ftExtra_0.5.0     
##  [9] flextable_0.9.1    dtplyr_1.3.1       magrittr_2.0.3     lubridate_1.9.2   
## [13] forcats_1.0.0      stringr_1.5.0      purrr_1.0.1        readr_2.1.4       
## [17] tibble_3.2.1       tidyverse_2.0.0    doParallel_1.0.17  iterators_1.0.14  
## [21] foreach_1.5.2      tidyr_1.3.0        ggplot2_3.4.2      EIX_1.2.0         
## [25] dplyr_1.1.1        glmnet_4.1-7       Matrix_1.5-3       lmtest_0.9-40     
## [29] zoo_1.8-12         data.table_1.14.8  lightgbm_3.3.5     R6_2.5.1          
## [33] pre_1.0.6         
## 
## loaded via a namespace (and not attached):
##   [1] uuid_1.1-0              backports_1.4.1         systemfonts_1.0.4      
##   [4] plyr_1.8.8              splines_4.2.3           mycor_0.1.1            
##   [7] urltools_1.7.3          digest_0.6.31           htmltools_0.5.5        
##  [10] earth_5.3.2             fansi_1.0.4             tzdb_0.3.0             
##  [13] officer_0.6.2           askpass_1.1             timechange_0.2.0       
##  [16] gfonts_0.2.0            colorspace_2.1-0        ggrepel_0.9.3          
##  [19] ggiraphExtra_0.3.0      textshaping_0.3.6       xfun_0.38              
##  [22] crayon_1.5.2            jsonlite_1.8.4          libcoin_1.0-9          
##  [25] survival_3.5-3          glue_1.6.2              gtable_0.3.3           
##  [28] ppcor_1.1               sjmisc_2.8.9            shape_1.4.6            
##  [31] scales_1.2.1            fontquiver_0.2.1        mvtnorm_1.1-3          
##  [34] Rcpp_1.0.10             plotrix_3.8-2           viridisLite_0.4.1      
##  [37] xtable_1.8-4            bit_4.0.5               Formula_1.2-5          
##  [40] fontLiberation_0.1.0    htmlwidgets_1.6.2       RColorBrewer_1.1-3     
##  [43] ellipsis_0.3.2          pkgconfig_2.0.3         farver_2.1.1           
##  [46] sass_0.4.5              utf8_1.2.3              crul_1.3               
##  [49] tidyselect_1.2.0        labeling_0.4.2          rlang_1.1.0            
##  [52] reshape2_1.4.4          later_1.3.0             munsell_0.5.0          
##  [55] TeachingDemos_2.12      tools_4.2.3             cachem_1.0.7           
##  [58] xgboost_1.7.5.1         cli_3.6.1               generics_0.1.3         
##  [61] sjlabelled_1.2.0        broom_1.0.4             evaluate_0.20          
##  [64] fastmap_1.1.1           yaml_2.3.7              ragg_1.2.5             
##  [67] knitr_1.42              bit64_4.0.5             zip_2.3.0              
##  [70] nlme_3.1-162            mime_0.12               ggiraph_0.8.7          
##  [73] xml2_1.3.3              compiler_4.2.3          rstudioapi_0.14        
##  [76] curl_5.0.0              bslib_0.4.2             stringi_1.7.12         
##  [79] plotmo_3.6.2            DALEX_2.4.3             highr_0.10             
##  [82] iBreakDown_2.0.1        gdtools_0.3.3           lattice_0.20-45        
##  [85] fontBitstreamVera_0.1.1 vctrs_0.6.2             pillar_1.9.0           
##  [88] lifecycle_1.0.3         triebeard_0.4.1         jquerylib_0.1.4        
##  [91] insight_0.19.1          httpuv_1.6.9            promises_1.2.0.1       
##  [94] codetools_0.2-19        MASS_7.3-58.2           assertthat_0.2.1       
##  [97] openssl_2.0.6           rprojroot_2.0.3         withr_2.5.0            
## [100] httpcode_0.3.0          mgcv_1.8-42             hms_1.1.3              
## [103] grid_4.2.3              rpart_4.1.19            rmarkdown_2.21         
## [106] inum_1.0-5              partykit_1.2-20         shiny_1.7.4

References

Microsoft Corporation and Steve Weston. 2022. doParallel: Foreach Parallel Adaptor for the ’Parallel’ Package. https://CRAN.R-project.org/package=doParallel.
Dowle, Matt, and Arun Srinivasan. 2023. Data.table: Extension of ’Data.frame’. https://CRAN.R-project.org/package=data.table.
Fokkema, Marjolein. 2020. “Fitting Prediction Rule Ensembles with R Package pre.” Journal of Statistical Software 92 (12): 1–30. https://doi.org/10.18637/jss.v092.i12.
Gohel, David, and Panagiotis Skintzos. 2023. Flextable: Functions for Tabular Reporting. https://CRAN.R-project.org/package=flextable.
Maksymiuk, Szymon, Ewelina Karbowiak, and Przemyslaw Biecek. 2021. EIX: Explain Interactions in ’XGBoost’. https://CRAN.R-project.org/package=EIX.
Mayer, Michael. 2023. Shapviz: SHAP Visualizations. https://CRAN.R-project.org/package=shapviz.
Müller, Kirill. 2020. Here: A Simpler Way to Find Your Files. https://CRAN.R-project.org/package=here.
Pedersen, Thomas Lin. 2023. Patchwork: The Composer of Plots. https://CRAN.R-project.org/package=patchwork.
R Core Team. 2023. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Richardson, Neal, Ian Cook, Nic Crane, Dewey Dunnington, Romain François, Jonathan Keane, Dragoş Moldovan-Grünfeld, Jeroen Ooms, and Apache Arrow. 2023. Arrow: Integration to ’Apache’ ’Arrow’. https://CRAN.R-project.org/package=arrow.
Shi, Yu, Guolin Ke, Damien Soukhavong, James Lamb, Qi Meng, Thomas Finley, Taifeng Wang, et al. 2023. Lightgbm: Light Gradient Boosting Machine. https://CRAN.R-project.org/package=lightgbm.
Tay, J. Kenneth, Balasubramanian Narasimhan, and Trevor Hastie. 2023. “Elastic Net Regularization Paths for All Generalized Linear Models.” Journal of Statistical Software 106 (1): 1–31. https://doi.org/10.18637/jss.v106.i01.
Wickham, Hadley. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
Wickham, Hadley, Maximilian Girlich, Mark Fairbanks, and Ryan Dickerson. 2023. Dtplyr: Data Table Back-End for ’Dplyr’. https://CRAN.R-project.org/package=dtplyr.
Zeileis, Achim, and Torsten Hothorn. 2002. “Diagnostic Checking in Regression Relationships.” R News 2 (3): 7–10. https://CRAN.R-project.org/doc/Rnews/.